Warning: Permanently added '18.205.192.221' (ED25519) to the list of known hosts.

You can reproduce this build on your computer by running:

  sudo dnf install copr-rpmbuild
  /usr/bin/copr-rpmbuild --verbose --drop-resultdir --task-url https://copr.fedorainfracloud.org/backend/get-build-task/7299609-fedora-38-x86_64 --chroot fedora-38-x86_64

Version: 0.72
PID: 16547
Logging PID: 16548
Task:
{'allow_user_ssh': False,
 'appstream': False,
 'background': False,
 'build_id': 7299609,
 'buildroot_pkgs': [],
 'chroot': 'fedora-38-x86_64',
 'enable_net': True,
 'fedora_review': False,
 'git_hash': '991790ff47aeb285f03454bb0516f8b63a450726',
 'git_repo': 'https://copr-dist-git.fedorainfracloud.org/git/rezso/ML/pytorch',
 'isolation': 'default',
 'memory_reqs': 2048,
 'package_name': 'pytorch',
 'package_version': '2.4.0-20240412.0.git7efaf54d.cu12_3',
 'project_dirname': 'ML',
 'project_name': 'ML',
 'project_owner': 'rezso',
 'repo_priority': None,
 'repos': [{'baseurl': 'https://download.copr.fedorainfracloud.org/results/rezso/ML/fedora-38-x86_64/',
            'id': 'copr_base',
            'name': 'Copr repository',
            'priority': None},
           {'baseurl': 'https://download.copr.fedorainfracloud.org/results/rezso/CUDA/fedora-38-x86_64/',
            'id': 'copr_rezso_CUDA',
            'name': 'Additional repo copr_rezso_CUDA'},
           {'baseurl': 'http://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64',
            'id': 'http_developer_download_nvidia_com_compute_cuda_repos_rhel8_x86_64',
            'name': 'Additional repo http_developer_download_nvidia_com_compute_cuda_repos_rhel8_x86_64'},
           {'baseurl': 'http://developer.download.nvidia.com/compute/cuda/repos/rhel8/sbsa',
            'id': 'http_developer_download_nvidia_com_compute_cuda_repos_rhel8_sbsa',
            'name': 'Additional repo http_developer_download_nvidia_com_compute_cuda_repos_rhel8_sbsa'},
           {'baseurl': 'http://developer.download.nvidia.com/compute/cuda/repos/rhel8/ppc64le',
            'id': 'http_developer_download_nvidia_com_compute_cuda_repos_rhel8_ppc64le',
            'name': 'Additional repo http_developer_download_nvidia_com_compute_cuda_repos_rhel8_ppc64le'}],
 'sandbox': 'rezso/ML--rezso',
 'source_json': {},
 'source_type': None,
 'ssh_public_keys': None,
 'submitter': 'rezso',
 'tags': [],
 'task_id': '7299609-fedora-38-x86_64',
 'timeout': 172800,
 'uses_devel_repo': False,
 'with_opts': [],
 'without_opts': []}

Running: git clone https://copr-dist-git.fedorainfracloud.org/git/rezso/ML/pytorch /var/lib/copr-rpmbuild/workspace/workdir-k3kry55z/pytorch --depth 500 --no-single-branch --recursive
cmd: ['git', 'clone', 'https://copr-dist-git.fedorainfracloud.org/git/rezso/ML/pytorch', '/var/lib/copr-rpmbuild/workspace/workdir-k3kry55z/pytorch', '--depth', '500', '--no-single-branch', '--recursive']
cwd: .
rc: 0
stdout:
stderr: Cloning into '/var/lib/copr-rpmbuild/workspace/workdir-k3kry55z/pytorch'...

Running: git checkout 991790ff47aeb285f03454bb0516f8b63a450726 --
cmd: ['git', 'checkout', '991790ff47aeb285f03454bb0516f8b63a450726', '--']
cwd: /var/lib/copr-rpmbuild/workspace/workdir-k3kry55z/pytorch
rc: 0
stdout:
stderr: Note: switching to '991790ff47aeb285f03454bb0516f8b63a450726'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

  git switch -c <new-branch-name>

Or undo this operation with:

  git switch -

Turn off this advice by setting config variable advice.detachedHead to false

HEAD is now at 991790f automatic import of pytorch

Running: copr-distgit-client sources
cmd: ['copr-distgit-client', 'sources']
cwd: /var/lib/copr-rpmbuild/workspace/workdir-k3kry55z/pytorch
rc: 0
stdout:
stderr: INFO: Reading stdout from command: git rev-parse --abbrev-ref HEAD
INFO: Reading stdout from command: git rev-parse HEAD
INFO: Reading sources specification file: sources
/usr/bin/tail: /var/lib/copr-rpmbuild/main.log: file truncated

Running (timeout=172800): unbuffer mock --spec /var/lib/copr-rpmbuild/workspace/workdir-k3kry55z/pytorch/pytorch.spec --sources /var/lib/copr-rpmbuild/workspace/workdir-k3kry55z/pytorch --resultdir /var/lib/copr-rpmbuild/results --uniqueext 1712885354.239135 -r /var/lib/copr-rpmbuild/results/configs/child.cfg
INFO: mock.py version 5.5 starting (python version = 3.12.1, NVR = mock-5.5-1.fc39), args: /usr/libexec/mock/mock --spec /var/lib/copr-rpmbuild/workspace/workdir-k3kry55z/pytorch/pytorch.spec --sources /var/lib/copr-rpmbuild/workspace/workdir-k3kry55z/pytorch --resultdir /var/lib/copr-rpmbuild/results --uniqueext 1712885354.239135 -r /var/lib/copr-rpmbuild/results/configs/child.cfg
Start(bootstrap): init plugins
INFO: tmpfs initialized
INFO: selinux enabled
INFO: chroot_scan: initialized
INFO: compress_logs: initialized
Finish(bootstrap): init plugins
Start: init plugins
INFO: tmpfs initialized
INFO: selinux enabled
INFO: chroot_scan: initialized
INFO: compress_logs: initialized
Finish: init plugins
INFO: Signal handler active
Start: run
INFO: Start(/var/lib/copr-rpmbuild/workspace/workdir-k3kry55z/pytorch/pytorch.spec) Config(fedora-38-x86_64)
Start: clean chroot
Finish: clean chroot
Mock Version: 5.5
INFO: Mock Version: 5.5
Start(bootstrap): chroot init
INFO: mounting tmpfs at /var/lib/mock/fedora-38-x86_64-bootstrap-1712885354.239135/root.
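The reproduction instructions at the top of this log can be collected into a small script. A sketch, assuming a Fedora host where the `copr-rpmbuild` package is available; the build id, chroot, and task URL are taken verbatim from the task dump above (the actual build commands are left commented out, since they need root and a long-running mock chroot):

```shell
# Build identifiers copied from the log's task dump.
build_id=7299609
chroot=fedora-38-x86_64

# The backend task URL has the form <base>/get-build-task/<build_id>-<chroot>.
task_url="https://copr.fedorainfracloud.org/backend/get-build-task/${build_id}-${chroot}"

# Actual reproduction steps (uncomment to run; requires root and mock):
# sudo dnf install copr-rpmbuild
# /usr/bin/copr-rpmbuild --verbose --drop-resultdir \
#     --task-url "$task_url" --chroot "$chroot"

echo "$task_url"
```

The constructed URL matches the `--task-url` argument printed in the log header.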
INFO: calling preinit hooks
INFO: enabled root cache
INFO: enabled package manager cache
Start(bootstrap): cleaning package manager metadata
Finish(bootstrap): cleaning package manager metadata
INFO: Guessed host environment type: unknown
INFO: Using bootstrap image: registry.fedoraproject.org/fedora:38
INFO: Pulling image: registry.fedoraproject.org/fedora:38
INFO: Copy content of container registry.fedoraproject.org/fedora:38 to /var/lib/mock/fedora-38-x86_64-bootstrap-1712885354.239135/root
INFO: Checking that registry.fedoraproject.org/fedora:38 image matches host's architecture
INFO: mounting registry.fedoraproject.org/fedora:38 with podman image mount
INFO: image registry.fedoraproject.org/fedora:38 as /var/lib/containers/storage/overlay/c12fdfa4ea5b7d6efc14a116bd49660835bd9a960dd1e256962eb1d88359408e/merged
INFO: umounting image registry.fedoraproject.org/fedora:38 (/var/lib/containers/storage/overlay/c12fdfa4ea5b7d6efc14a116bd49660835bd9a960dd1e256962eb1d88359408e/merged) with podman image umount
INFO: Package manager dnf detected and used (fallback)
INFO: Bootstrap image not marked ready
Start(bootstrap): installing dnf tooling
No matches found for the following disable plugin patterns: local, spacewalk, versionlock
Copr repository                                  14 MB/s | 961 kB  00:00
Additional repo copr_rezso_CUDA                 1.3 MB/s |  71 kB  00:00
Additional repo http_developer_download_nvidia_ 137 MB/s | 3.3 MB  00:00
Additional repo http_developer_download_nvidia_ 112 MB/s | 2.0 MB  00:00
Additional repo http_developer_download_nvidia_  87 MB/s | 1.8 MB  00:00
fedora                                           58 MB/s |  83 MB  00:01
updates                                          54 MB/s |  41 MB  00:00
Package python3-dnf-4.18.2-1.fc38.noarch is already installed.
Dependencies resolved.
================================================================================
 Package                   Arch    Version           Repository  Size
================================================================================
Installing:
 python3-dnf-plugins-core  noarch  4.6.0-1.fc38      updates     323 k
Upgrading:
 dnf                       noarch  4.19.2-1.fc38     updates     504 k
 dnf-data                  noarch  4.19.2-1.fc38     updates      39 k
 libdnf                    x86_64  0.73.1-1.fc38     updates     681 k
 python3-dnf               noarch  4.19.2-1.fc38     updates     609 k
 python3-hawkey            x86_64  0.73.1-1.fc38     updates     107 k
 python3-libdnf            x86_64  0.73.1-1.fc38     updates     859 k
 yum                       noarch  4.19.2-1.fc38     updates      36 k
Installing dependencies:
 dbus-libs                 x86_64  1:1.14.10-1.fc38  updates     156 k
 python3-dateutil          noarch  1:2.8.2-5.fc38    fedora      360 k
 python3-dbus              x86_64  1.3.2-2.fc38      fedora      157 k
 python3-distro            noarch  1.8.0-2.fc38      fedora       49 k
 python3-six               noarch  1.16.0-9.fc38     fedora       42 k
 python3-systemd           x86_64  235-2.fc38        fedora      108 k

Transaction Summary
================================================================================
Install  7 Packages
Upgrade  7 Packages

Total download size: 3.9 M
Downloading Packages:
(1/14): python3-distro-1.8.0-2.fc38.noarch.rpm   2.1 MB/s |  49 kB  00:00
(2/14): python3-six-1.16.0-9.fc38.noarch.rpm      15 MB/s |  42 kB  00:00
(3/14): python3-dbus-1.3.2-2.fc38.x86_64.rpm     5.3 MB/s | 157 kB  00:00
(4/14): python3-dateutil-2.8.2-5.fc38.noarch.rp   12 MB/s | 360 kB  00:00
(5/14): python3-systemd-235-2.fc38.x86_64.rpm     24 MB/s | 108 kB  00:00
(6/14): dbus-libs-1.14.10-1.fc38.x86_64.rpm       51 MB/s | 156 kB  00:00
(7/14): python3-dnf-plugins-core-4.6.0-1.fc38.n   49 MB/s | 323 kB  00:00
(8/14): dnf-4.19.2-1.fc38.noarch.rpm              61 MB/s | 504 kB  00:00
(9/14): dnf-data-4.19.2-1.fc38.noarch.rpm        5.0 MB/s |  39 kB  00:00
(10/14): python3-hawkey-0.73.1-1.fc38.x86_64.rp   17 MB/s | 107 kB  00:00
(11/14): python3-dnf-4.19.2-1.fc38.noarch.rpm     52 MB/s | 609 kB  00:00
(12/14): libdnf-0.73.1-1.fc38.x86_64.rpm          46 MB/s | 681 kB  00:00
(13/14): yum-4.19.2-1.fc38.noarch.rpm            4.6 MB/s |  36 kB  00:00
(14/14): python3-libdnf-0.73.1-1.fc38.x86_64.rp   45 MB/s | 859 kB  00:00
--------------------------------------------------------------------------------
Total                                             24 MB/s | 3.9 MB  00:00
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Preparing        :                                                    1/1
  Upgrading        : libdnf-0.73.1-1.fc38.x86_64                        1/21
  Upgrading        : python3-libdnf-0.73.1-1.fc38.x86_64                2/21
  Upgrading        : python3-hawkey-0.73.1-1.fc38.x86_64                3/21
  Upgrading        : dnf-data-4.19.2-1.fc38.noarch                      4/21
  Upgrading        : python3-dnf-4.19.2-1.fc38.noarch                   5/21
  Upgrading        : dnf-4.19.2-1.fc38.noarch                           6/21
  Running scriptlet: dnf-4.19.2-1.fc38.noarch                           6/21
  Installing       : dbus-libs-1:1.14.10-1.fc38.x86_64                  7/21
  Installing       : python3-dbus-1.3.2-2.fc38.x86_64                   8/21
  Installing       : python3-systemd-235-2.fc38.x86_64                  9/21
  Installing       : python3-six-1.16.0-9.fc38.noarch                  10/21
  Installing       : python3-dateutil-1:2.8.2-5.fc38.noarch            11/21
  Installing       : python3-distro-1.8.0-2.fc38.noarch                12/21
  Installing       : python3-dnf-plugins-core-4.6.0-1.fc38.noarch      13/21
  Upgrading        : yum-4.19.2-1.fc38.noarch                          14/21
  Cleanup          : yum-4.18.2-1.fc38.noarch                          15/21
  Running scriptlet: dnf-4.18.2-1.fc38.noarch                          16/21
  Cleanup          : dnf-4.18.2-1.fc38.noarch                          16/21
  Running scriptlet: dnf-4.18.2-1.fc38.noarch                          16/21
  Cleanup          : python3-dnf-4.18.2-1.fc38.noarch                  17/21
  Cleanup          : python3-hawkey-0.72.0-1.fc38.x86_64               18/21
  Cleanup          : dnf-data-4.18.2-1.fc38.noarch                     19/21
  Cleanup          : python3-libdnf-0.72.0-1.fc38.x86_64               20/21
  Cleanup          : libdnf-0.72.0-1.fc38.x86_64                       21/21
  Running scriptlet: libdnf-0.72.0-1.fc38.x86_64                       21/21
  Verifying        : python3-dateutil-1:2.8.2-5.fc38.noarch             1/21
  Verifying        : python3-dbus-1.3.2-2.fc38.x86_64                   2/21
  Verifying        : python3-distro-1.8.0-2.fc38.noarch                 3/21
  Verifying        : python3-six-1.16.0-9.fc38.noarch                   4/21
  Verifying        : python3-systemd-235-2.fc38.x86_64                  5/21
  Verifying        : dbus-libs-1:1.14.10-1.fc38.x86_64                  6/21
  Verifying        : python3-dnf-plugins-core-4.6.0-1.fc38.noarch       7/21
  Verifying        : dnf-4.19.2-1.fc38.noarch                           8/21
  Verifying        : dnf-4.18.2-1.fc38.noarch                           9/21
  Verifying        : dnf-data-4.19.2-1.fc38.noarch                     10/21
  Verifying        : dnf-data-4.18.2-1.fc38.noarch                     11/21
  Verifying        : libdnf-0.73.1-1.fc38.x86_64                       12/21
  Verifying        : libdnf-0.72.0-1.fc38.x86_64                       13/21
  Verifying        : python3-dnf-4.19.2-1.fc38.noarch                  14/21
  Verifying        : python3-dnf-4.18.2-1.fc38.noarch                  15/21
  Verifying        : python3-hawkey-0.73.1-1.fc38.x86_64               16/21
  Verifying        : python3-hawkey-0.72.0-1.fc38.x86_64               17/21
  Verifying        : python3-libdnf-0.73.1-1.fc38.x86_64               18/21
  Verifying        : python3-libdnf-0.72.0-1.fc38.x86_64               19/21
  Verifying        : yum-4.19.2-1.fc38.noarch                          20/21
  Verifying        : yum-4.18.2-1.fc38.noarch                          21/21

Upgraded:
  dnf-4.19.2-1.fc38.noarch              dnf-data-4.19.2-1.fc38.noarch
  libdnf-0.73.1-1.fc38.x86_64           python3-dnf-4.19.2-1.fc38.noarch
  python3-hawkey-0.73.1-1.fc38.x86_64   python3-libdnf-0.73.1-1.fc38.x86_64
  yum-4.19.2-1.fc38.noarch
Installed:
  dbus-libs-1:1.14.10-1.fc38.x86_64     python3-dateutil-1:2.8.2-5.fc38.noarch
  python3-dbus-1.3.2-2.fc38.x86_64      python3-distro-1.8.0-2.fc38.noarch
  python3-dnf-plugins-core-4.6.0-1.fc38.noarch
  python3-six-1.16.0-9.fc38.noarch      python3-systemd-235-2.fc38.x86_64

Complete!
Finish(bootstrap): installing dnf tooling
Start(bootstrap): creating root cache
Finish(bootstrap): creating root cache
Finish(bootstrap): chroot init
Start: chroot init
INFO: mounting tmpfs at /var/lib/mock/fedora-38-x86_64-1712885354.239135/root.
INFO: calling preinit hooks
INFO: enabled root cache
INFO: enabled package manager cache
Start: cleaning package manager metadata
Finish: cleaning package manager metadata
INFO: enabled HW Info plugin
INFO: Package manager dnf detected and used (direct choice)
INFO: Buildroot is handled by package management downloaded with a bootstrap image:
  rpm-4.18.2-1.fc38.x86_64 rpm-sequoia-1.5.0-2.fc38.x86_64 python3-dnf-4.19.2-1.fc38.noarch python3-dnf-plugins-core-4.6.0-1.fc38.noarch yum-4.19.2-1.fc38.noarch
Start: installing minimal buildroot with dnf
No matches found for the following disable plugin patterns: local, spacewalk, versionlock
Copr repository                                  72 kB/s | 1.8 kB  00:00
Copr repository                                  15 MB/s | 961 kB  00:00
Additional repo copr_rezso_CUDA                  72 kB/s | 1.8 kB  00:00
Additional repo http_developer_download_nvidia_ 741 kB/s | 3.5 kB  00:00
Additional repo http_developer_download_nvidia_ 222 kB/s | 3.5 kB  00:00
Additional repo http_developer_download_nvidia_ 747 kB/s | 3.5 kB  00:00
fedora                                           90 kB/s |  24 kB  00:00
updates                                          34 kB/s | 9.1 kB  00:00
Dependencies resolved.
================================================================================
 Package                        Arch    Version                      Repo     Size
================================================================================
Installing group/module packages:
 bash                           x86_64  5.2.26-1.fc38                updates  1.8 M
 bzip2                          x86_64  1.0.8-13.fc38                fedora    52 k
 coreutils                      x86_64  9.1-12.fc38                  updates  1.1 M
 cpio                           x86_64  2.13-14.fc38                 fedora   276 k
 diffutils                      x86_64  3.10-1.fc38                  updates  398 k
 fedora-release-common          noarch  38-36                        updates   22 k
 findutils                      x86_64  1:4.9.0-3.fc38               fedora   492 k
 gawk                           x86_64  5.1.1-5.fc38                 fedora   1.0 M
 glibc-minimal-langpack         x86_64  2.37-18.fc38                 updates   41 k
 grep                           x86_64  3.8-3.fc38                   fedora   293 k
 gzip                           x86_64  1.12-3.fc38                  fedora   166 k
 info                           x86_64  7.0.2-2.fc38                 fedora   181 k
 patch                          x86_64  2.7.6-19.fc38                fedora   126 k
 redhat-rpm-config              noarch  257-1.fc38                   updates   76 k
 rpm-build                      x86_64  4.18.2-1.fc38                updates   76 k
 sed                            x86_64  4.8-12.fc38                  fedora   306 k
 shadow-utils                   x86_64  2:4.13-6.fc38                fedora   1.3 M
 tar                            x86_64  2:1.34-8.fc38                fedora   889 k
 unzip                          x86_64  6.0-60.fc38                  fedora   184 k
 util-linux                     x86_64  2.38.1-4.fc38                fedora   2.3 M
 which                          x86_64  2.21-39.fc38                 fedora    42 k
 xz                             x86_64  5.4.1-1.fc38                 fedora   419 k
Installing dependencies:
 alternatives                   x86_64  1.26-1.fc38                  updates   39 k
 ansible-srpm-macros            noarch  1-12.fc38                    updates   21 k
 audit-libs                     x86_64  3.1.2-8.fc38                 updates  117 k
 authselect                     x86_64  1.4.3-1.fc38                 updates  149 k
 authselect-libs                x86_64  1.4.3-1.fc38                 updates  249 k
 basesystem                     noarch  11-15.fc38                   fedora   7.0 k
 binutils                       x86_64  2.39-16.fc38                 updates  5.4 M
 binutils-gold                  x86_64  2.39-16.fc38                 updates  795 k
 bzip2-libs                     x86_64  1.0.8-13.fc38                fedora    41 k
 ca-certificates                noarch  2023.2.60_v7.0.306-1.0.fc38  updates  837 k
 coreutils-common               x86_64  9.1-12.fc38                  updates  2.0 M
 cracklib                       x86_64  2.9.11-1.fc38                updates   93 k
 crypto-policies                noarch  20230301-1.gita12f7b2.fc38   fedora    93 k
 curl                           x86_64  8.0.1-7.fc38                 updates  348 k
 cyrus-sasl-lib                 x86_64  2.1.28-9.fc38                fedora   794 k
 debugedit                      x86_64  5.0-9.fc38                   updates   78 k
 dwz                            x86_64  0.15-2.fc38                  fedora   135 k
 ed                             x86_64  1.19-2.fc38                  fedora    78 k
 efi-srpm-macros                noarch  5-7.fc38                     fedora    22 k
 elfutils                       x86_64  0.191-1.fc38                 updates  557 k
 elfutils-debuginfod-client     x86_64  0.191-1.fc38                 updates   37 k
 elfutils-default-yama-scope    noarch  0.191-1.fc38                 updates   12 k
 elfutils-libelf                x86_64  0.191-1.fc38                 updates  208 k
 elfutils-libs                  x86_64  0.191-1.fc38                 updates  263 k
 fedora-gpg-keys                noarch  38-1                         fedora   126 k
 fedora-release                 noarch  38-36                        updates   12 k
 fedora-release-identity-basic  noarch  38-36                        updates   13 k
 fedora-repos                   noarch  38-1                         fedora   9.1 k
 file                           x86_64  5.44-3.fc38                  fedora    49 k
 file-libs                      x86_64  5.44-3.fc38                  fedora   730 k
 filesystem                     x86_64  3.18-3.fc38                  fedora   1.1 M
 fonts-srpm-macros              noarch  1:2.0.5-11.fc38              fedora    26 k
 forge-srpm-macros              noarch  0.2.0-3.fc38                 updates   19 k
 fpc-srpm-macros                noarch  1.3-7.fc38                   fedora   7.8 k
 gdb-minimal                    x86_64  14.1-3.fc38                  updates  4.3 M
 gdbm-libs                      x86_64  1:1.23-3.fc38                fedora    56 k
 ghc-srpm-macros                noarch  1.6.1-1.fc38                 fedora   8.0 k
 glibc                          x86_64  2.37-18.fc38                 updates  2.1 M
 glibc-common                   x86_64  2.37-18.fc38                 updates  320 k
 glibc-gconv-extra              x86_64  2.37-18.fc38                 updates  1.6 M
 gmp                            x86_64  1:6.2.1-4.fc38               fedora   313 k
 gnat-srpm-macros               noarch  6-2.fc38                     fedora   8.8 k
 go-srpm-macros                 noarch  3.5.0-1.fc38                 updates   27 k
 jansson                        x86_64  2.13.1-6.fc38                fedora    44 k
 kernel-srpm-macros             noarch  1.0-19.fc38                  updates   10 k
 keyutils-libs                  x86_64  1.6.3-1.fc38                 updates   31 k
 krb5-libs                      x86_64  1.21-3.fc38                  updates  764 k
 libacl                         x86_64  2.3.1-7.fc38                 updates   23 k
 libarchive                     x86_64  3.6.1-4.fc38                 fedora   400 k
 libattr                        x86_64  2.5.1-6.fc38                 fedora    18 k
 libblkid                       x86_64  2.38.1-4.fc38                fedora   106 k
 libbrotli                      x86_64  1.0.9-11.fc38                fedora   317 k
 libcap                         x86_64  2.48-8.fc38                  updates   68 k
 libcap-ng                      x86_64  0.8.3-8.fc38                 updates   32 k
 libcom_err                     x86_64  1.46.5-4.fc38                fedora    26 k
 libcurl                        x86_64  8.0.1-7.fc38                 updates  315 k
 libdb                          x86_64  5.3.28-55.fc38               fedora   758 k
 libeconf                       x86_64  0.5.2-1.fc38                 updates   30 k
 libevent                       x86_64  2.1.12-8.fc38                fedora   257 k
 libfdisk                       x86_64  2.38.1-4.fc38                fedora   161 k
 libffi                         x86_64  3.4.4-2.fc38                 fedora    38 k
 libgcc                         x86_64  13.2.1-7.fc38                updates  115 k
 libgomp                        x86_64  13.2.1-7.fc38                updates  324 k
 libidn2                        x86_64  2.3.7-1.fc38                 updates  119 k
 libmount                       x86_64  2.38.1-4.fc38                fedora   135 k
 libnghttp2                     x86_64  1.52.0-2.fc38                updates   75 k
 libnsl2                        x86_64  2.0.0-5.fc38                 fedora    30 k
 libpkgconf                     x86_64  1.8.0-6.fc38                 fedora    35 k
 libpsl                         x86_64  0.21.2-2.fc38                fedora    65 k
 libpwquality                   x86_64  1.4.5-3.fc38                 fedora   119 k
 libselinux                     x86_64  3.5-1.fc38                   fedora    87 k
 libsemanage                    x86_64  3.5-2.fc38                   fedora   120 k
 libsepol                       x86_64  3.5-1.fc38                   fedora   324 k
 libsigsegv                     x86_64  2.14-4.fc38                  fedora    27 k
 libsmartcols                   x86_64  2.38.1-4.fc38                fedora    64 k
 libssh                         x86_64  0.10.6-2.fc38                updates  212 k
 libssh-config                  noarch  0.10.6-2.fc38                updates  9.0 k
 libstdc++                      x86_64  13.2.1-7.fc38                updates  870 k
 libtasn1                       x86_64  4.19.0-2.fc38                fedora    74 k
 libtirpc                       x86_64  1.3.4-1.rc3.fc38             updates   93 k
 libunistring                   x86_64  1.1-3.fc38                   fedora   545 k
 libunistring1.0                x86_64  1.0-1.fc38                   fedora   539 k
 libutempter                    x86_64  1.2.1-8.fc38                 fedora    26 k
 libuuid                        x86_64  2.38.1-4.fc38                fedora    28 k
 libverto                       x86_64  0.3.2-5.fc38                 fedora    21 k
 libxcrypt                      x86_64  4.4.36-1.fc38                updates  119 k
 libxml2                        x86_64  2.10.4-1.fc38                updates  701 k
 libzstd                        x86_64  1.5.5-1.fc38                 updates  308 k
 lua-libs                       x86_64  5.4.4-9.fc38                 fedora   133 k
 lua-srpm-macros                noarch  1-13.fc38                    updates  8.7 k
 lz4-libs                       x86_64  1.9.4-2.fc38                 fedora    67 k
 mpfr                           x86_64  4.1.1-3.fc38                 fedora   600 k
 ncurses-base                   noarch  6.4-7.20230520.fc38.1        updates   88 k
 ncurses-libs                   x86_64  6.4-7.20230520.fc38.1        updates  336 k
 ocaml-srpm-macros              noarch  7-3.fc38                     fedora    13 k
 openblas-srpm-macros           noarch  2-13.fc38                    fedora   7.5 k
 openldap                       x86_64  2.6.6-1.fc38                 updates  254 k
 openssl-libs                   x86_64  1:3.0.9-2.fc38               updates  2.1 M
 p11-kit                        x86_64  0.25.3-1.fc38                updates  521 k
 p11-kit-trust                  x86_64  0.25.3-1.fc38                updates  142 k
 package-notes-srpm-macros      noarch  0.5-8.fc38                   updates   11 k
 pam                            x86_64  1.5.2-16.fc38                fedora   560 k
 pam-libs                       x86_64  1.5.2-16.fc38                fedora    58 k
 pcre2                          x86_64  10.42-1.fc38.1               fedora   234 k
 pcre2-syntax                   noarch  10.42-1.fc38.1               fedora   144 k
 perl-srpm-macros               noarch  1-48.fc38                    fedora   8.4 k
 pkgconf                        x86_64  1.8.0-6.fc38                 fedora    41 k
 pkgconf-m4                     noarch  1.8.0-6.fc38                 fedora    14 k
 pkgconf-pkg-config             x86_64  1.8.0-6.fc38                 fedora   9.6 k
 popt                           x86_64  1.19-2.fc38                  fedora    67 k
 publicsuffix-list-dafsa        noarch  20240107-1.fc38              updates   58 k
 pyproject-srpm-macros          noarch  1.12.0-1.fc38                updates   14 k
 python-srpm-macros             noarch  3.11-10.fc38                 fedora    26 k
 qt5-srpm-macros                noarch  5.15.12-1.fc38               updates  8.3 k
 qt6-srpm-macros                noarch  6.6.0-1.fc38                 updates  8.6 k
 readline                       x86_64  8.2-4.fc38                   updates  212 k
 rpm                            x86_64  4.18.2-1.fc38                updates  567 k
 rpm-build-libs                 x86_64  4.18.2-1.fc38                updates   93 k
 rpm-libs                       x86_64  4.18.2-1.fc38                updates  310 k
 rpm-sequoia                    x86_64  1.6.0-1.fc38                 updates  872 k
 rpmautospec-rpm-macros         noarch  0.6.3-1.fc38                 updates   10 k
 rust-srpm-macros               noarch  26.2-1.fc38                  updates   13 k
 setup                          noarch  2.14.3-2.fc38                fedora   152 k
 sqlite-libs                    x86_64  3.40.1-2.fc38                fedora   666 k
 systemd-libs                   x86_64  253.17-1.fc38                updates  649 k
 tzdata                         noarch  2024a-1.fc38                 updates  715 k
 util-linux-core                x86_64  2.38.1-4.fc38                fedora   473 k
 xxhash-libs                    x86_64  0.8.2-1.fc38                 updates   37 k
 xz-libs                        x86_64  5.4.1-1.fc38                 fedora   108 k
 zip                            x86_64  3.0-37.fc38                  updates  265 k
 zlib                           x86_64  1.2.13-3.fc38                fedora    95 k
 zstd                           x86_64  1.5.5-1.fc38                 updates  482 k
Installing Groups:
 Buildsystem building group

Transaction Summary
================================================================================
Install  154 Packages

Total size: 54 M
Installed size: 187 M
Downloading Packages:
[SKIPPED] basesystem-11-15.fc38.noarch.rpm: Already downloaded
[SKIPPED] bzip2-1.0.8-13.fc38.x86_64.rpm: Already downloaded
[SKIPPED] bzip2-libs-1.0.8-13.fc38.x86_64.rpm: Already downloaded
[SKIPPED] cpio-2.13-14.fc38.x86_64.rpm: Already downloaded
[SKIPPED] crypto-policies-20230301-1.gita12f7b2.fc38.noarch.rpm: Already downloaded
[SKIPPED] cyrus-sasl-lib-2.1.28-9.fc38.x86_64.rpm: Already downloaded
[SKIPPED] dwz-0.15-2.fc38.x86_64.rpm: Already downloaded
[SKIPPED] ed-1.19-2.fc38.x86_64.rpm: Already downloaded
[SKIPPED] efi-srpm-macros-5-7.fc38.noarch.rpm: Already downloaded
[SKIPPED] fedora-gpg-keys-38-1.noarch.rpm: Already downloaded
[SKIPPED] fedora-repos-38-1.noarch.rpm: Already downloaded
[SKIPPED] file-5.44-3.fc38.x86_64.rpm: Already downloaded
[SKIPPED] file-libs-5.44-3.fc38.x86_64.rpm: Already downloaded
[SKIPPED] filesystem-3.18-3.fc38.x86_64.rpm: Already downloaded
[SKIPPED] findutils-4.9.0-3.fc38.x86_64.rpm: Already downloaded
[SKIPPED] fonts-srpm-macros-2.0.5-11.fc38.noarch.rpm: Already downloaded
[SKIPPED] fpc-srpm-macros-1.3-7.fc38.noarch.rpm: Already downloaded
[SKIPPED] gawk-5.1.1-5.fc38.x86_64.rpm: Already downloaded
[SKIPPED] gdbm-libs-1.23-3.fc38.x86_64.rpm: Already downloaded
[SKIPPED] ghc-srpm-macros-1.6.1-1.fc38.noarch.rpm: Already downloaded
[SKIPPED] gmp-6.2.1-4.fc38.x86_64.rpm: Already downloaded
[SKIPPED] gnat-srpm-macros-6-2.fc38.noarch.rpm: Already downloaded
[SKIPPED] grep-3.8-3.fc38.x86_64.rpm: Already downloaded
[SKIPPED] gzip-1.12-3.fc38.x86_64.rpm: Already downloaded
[SKIPPED] info-7.0.2-2.fc38.x86_64.rpm: Already downloaded
[SKIPPED] jansson-2.13.1-6.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libarchive-3.6.1-4.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libattr-2.5.1-6.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libblkid-2.38.1-4.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libbrotli-1.0.9-11.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libcom_err-1.46.5-4.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libdb-5.3.28-55.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libevent-2.1.12-8.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libfdisk-2.38.1-4.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libffi-3.4.4-2.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libmount-2.38.1-4.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libnsl2-2.0.0-5.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libpkgconf-1.8.0-6.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libpsl-0.21.2-2.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libpwquality-1.4.5-3.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libselinux-3.5-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libsemanage-3.5-2.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libsepol-3.5-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libsigsegv-2.14-4.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libsmartcols-2.38.1-4.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libtasn1-4.19.0-2.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libunistring-1.1-3.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libunistring1.0-1.0-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libutempter-1.2.1-8.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libuuid-2.38.1-4.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libverto-0.3.2-5.fc38.x86_64.rpm: Already downloaded
[SKIPPED] lua-libs-5.4.4-9.fc38.x86_64.rpm: Already downloaded
[SKIPPED] lz4-libs-1.9.4-2.fc38.x86_64.rpm: Already downloaded
[SKIPPED] mpfr-4.1.1-3.fc38.x86_64.rpm: Already downloaded
[SKIPPED] ocaml-srpm-macros-7-3.fc38.noarch.rpm: Already downloaded
[SKIPPED] openblas-srpm-macros-2-13.fc38.noarch.rpm: Already downloaded
[SKIPPED] pam-1.5.2-16.fc38.x86_64.rpm: Already downloaded
[SKIPPED] pam-libs-1.5.2-16.fc38.x86_64.rpm: Already downloaded
[SKIPPED] patch-2.7.6-19.fc38.x86_64.rpm: Already downloaded
[SKIPPED] pcre2-10.42-1.fc38.1.x86_64.rpm: Already downloaded
[SKIPPED] pcre2-syntax-10.42-1.fc38.1.noarch.rpm: Already downloaded
[SKIPPED] perl-srpm-macros-1-48.fc38.noarch.rpm: Already downloaded
[SKIPPED] pkgconf-1.8.0-6.fc38.x86_64.rpm: Already downloaded
[SKIPPED] pkgconf-m4-1.8.0-6.fc38.noarch.rpm: Already downloaded
[SKIPPED] pkgconf-pkg-config-1.8.0-6.fc38.x86_64.rpm: Already downloaded
[SKIPPED] popt-1.19-2.fc38.x86_64.rpm: Already downloaded
[SKIPPED] python-srpm-macros-3.11-10.fc38.noarch.rpm: Already downloaded
[SKIPPED] sed-4.8-12.fc38.x86_64.rpm: Already downloaded
[SKIPPED] setup-2.14.3-2.fc38.noarch.rpm: Already downloaded
[SKIPPED] shadow-utils-4.13-6.fc38.x86_64.rpm: Already downloaded
[SKIPPED] sqlite-libs-3.40.1-2.fc38.x86_64.rpm: Already downloaded
[SKIPPED] tar-1.34-8.fc38.x86_64.rpm: Already downloaded
[SKIPPED] unzip-6.0-60.fc38.x86_64.rpm: Already downloaded
[SKIPPED] util-linux-2.38.1-4.fc38.x86_64.rpm: Already downloaded
[SKIPPED] util-linux-core-2.38.1-4.fc38.x86_64.rpm: Already downloaded
[SKIPPED] which-2.21-39.fc38.x86_64.rpm: Already downloaded
[SKIPPED] xz-5.4.1-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] xz-libs-5.4.1-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] zlib-1.2.13-3.fc38.x86_64.rpm: Already downloaded
[SKIPPED] alternatives-1.26-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] ansible-srpm-macros-1-12.fc38.noarch.rpm: Already downloaded
[SKIPPED] audit-libs-3.1.2-8.fc38.x86_64.rpm: Already downloaded
[SKIPPED] authselect-1.4.3-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] authselect-libs-1.4.3-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] bash-5.2.26-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] binutils-2.39-16.fc38.x86_64.rpm: Already downloaded
[SKIPPED] binutils-gold-2.39-16.fc38.x86_64.rpm: Already downloaded
[SKIPPED] ca-certificates-2023.2.60_v7.0.306-1.0.fc38.noarch.rpm: Already downloaded
[SKIPPED] coreutils-9.1-12.fc38.x86_64.rpm: Already downloaded
[SKIPPED] coreutils-common-9.1-12.fc38.x86_64.rpm: Already downloaded
[SKIPPED] cracklib-2.9.11-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] curl-8.0.1-7.fc38.x86_64.rpm: Already downloaded
[SKIPPED] debugedit-5.0-9.fc38.x86_64.rpm: Already downloaded
[SKIPPED] diffutils-3.10-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] elfutils-0.191-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] elfutils-debuginfod-client-0.191-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] elfutils-default-yama-scope-0.191-1.fc38.noarch.rpm: Already downloaded
[SKIPPED] elfutils-libelf-0.191-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] elfutils-libs-0.191-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] fedora-release-38-36.noarch.rpm: Already downloaded
[SKIPPED] fedora-release-common-38-36.noarch.rpm: Already downloaded
[SKIPPED] fedora-release-identity-basic-38-36.noarch.rpm: Already downloaded
[SKIPPED] forge-srpm-macros-0.2.0-3.fc38.noarch.rpm: Already downloaded
[SKIPPED] gdb-minimal-14.1-3.fc38.x86_64.rpm: Already downloaded
[SKIPPED] glibc-2.37-18.fc38.x86_64.rpm: Already downloaded
[SKIPPED] glibc-common-2.37-18.fc38.x86_64.rpm: Already downloaded
[SKIPPED] glibc-gconv-extra-2.37-18.fc38.x86_64.rpm: Already downloaded
[SKIPPED] glibc-minimal-langpack-2.37-18.fc38.x86_64.rpm: Already downloaded
[SKIPPED] go-srpm-macros-3.5.0-1.fc38.noarch.rpm: Already downloaded
[SKIPPED] kernel-srpm-macros-1.0-19.fc38.noarch.rpm: Already downloaded
[SKIPPED] keyutils-libs-1.6.3-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] krb5-libs-1.21-3.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libacl-2.3.1-7.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libcap-2.48-8.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libcap-ng-0.8.3-8.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libcurl-8.0.1-7.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libeconf-0.5.2-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libgcc-13.2.1-7.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libgomp-13.2.1-7.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libidn2-2.3.7-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libnghttp2-1.52.0-2.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libssh-0.10.6-2.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libssh-config-0.10.6-2.fc38.noarch.rpm: Already downloaded
[SKIPPED] libstdc++-13.2.1-7.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libtirpc-1.3.4-1.rc3.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libxcrypt-4.4.36-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libxml2-2.10.4-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libzstd-1.5.5-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] lua-srpm-macros-1-13.fc38.noarch.rpm: Already downloaded
[SKIPPED] ncurses-base-6.4-7.20230520.fc38.1.noarch.rpm: Already downloaded
[SKIPPED] ncurses-libs-6.4-7.20230520.fc38.1.x86_64.rpm: Already downloaded
[SKIPPED] openldap-2.6.6-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] openssl-libs-3.0.9-2.fc38.x86_64.rpm: Already downloaded
[SKIPPED] p11-kit-0.25.3-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] p11-kit-trust-0.25.3-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] package-notes-srpm-macros-0.5-8.fc38.noarch.rpm: Already downloaded
[SKIPPED] publicsuffix-list-dafsa-20240107-1.fc38.noarch.rpm: Already downloaded
[SKIPPED] pyproject-srpm-macros-1.12.0-1.fc38.noarch.rpm: Already downloaded
[SKIPPED] qt5-srpm-macros-5.15.12-1.fc38.noarch.rpm: Already downloaded
[SKIPPED] qt6-srpm-macros-6.6.0-1.fc38.noarch.rpm: Already downloaded
[SKIPPED] readline-8.2-4.fc38.x86_64.rpm: Already downloaded
[SKIPPED] redhat-rpm-config-257-1.fc38.noarch.rpm: Already downloaded
[SKIPPED] rpm-4.18.2-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] rpm-build-4.18.2-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] rpm-build-libs-4.18.2-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] rpm-libs-4.18.2-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] rpm-sequoia-1.6.0-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] rpmautospec-rpm-macros-0.6.3-1.fc38.noarch.rpm: Already downloaded
[SKIPPED] rust-srpm-macros-26.2-1.fc38.noarch.rpm: Already downloaded
[SKIPPED] systemd-libs-253.17-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] tzdata-2024a-1.fc38.noarch.rpm: Already downloaded
[SKIPPED] xxhash-libs-0.8.2-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] zip-3.0-37.fc38.x86_64.rpm: Already downloaded
[SKIPPED] zstd-1.5.5-1.fc38.x86_64.rpm: Already downloaded
fedora                                          1.6 MB/s | 1.6 kB  00:00
Importing GPG key 0xEB10B464:
 Userid     : "Fedora (38) <fedora-38-primary@fedoraproject.org>"
 Fingerprint: 6A51 BBAB BA3D 5467 B617 1221 809A 8D7C EB10 B464
 From       : /usr/share/distribution-gpg-keys/fedora/RPM-GPG-KEY-fedora-38-primary
Key imported successfully
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
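The key import above can be sanity-checked against the printed fingerprint. A minimal sketch, assuming only the log's own data: for a v4 OpenPGP key the 8-digit short key id shown by dnf (`0xEB10B464`) is the last 8 hex digits of the 160-bit fingerprint:

```shell
# Fingerprint reported by dnf for key 0xEB10B464 (copied from the log)
fpr="6A51 BBAB BA3D 5467 B617 1221 809A 8D7C EB10 B464"

# Strip spaces and take the last 8 hex digits: that is the short key id.
keyid=$(printf '%s' "$fpr" | tr -d ' ' | tail -c 8)
echo "$keyid"   # → EB10B464
```

If the computed suffix did not match the key id dnf announced, the key being imported would not be the one the fingerprint describes.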
Running transaction
  Running scriptlet: filesystem-3.18-3.fc38.x86_64                      1/1
  Preparing        :                                                    1/1
  Installing       : libgcc-13.2.1-7.fc38.x86_64                      1/154
  Running scriptlet: libgcc-13.2.1-7.fc38.x86_64                      1/154
  Installing       : crypto-policies-20230301-1.gita12f7b2.fc38.noarc 2/154
  Running scriptlet: crypto-policies-20230301-1.gita12f7b2.fc38.noarc 2/154
  Installing       : tzdata-2024a-1.fc38.noarch                       3/154
  Installing       : fedora-release-identity-basic-38-36.noarch       4/154
  Installing       : rust-srpm-macros-26.2-1.fc38.noarch              5/154
  Installing       : qt6-srpm-macros-6.6.0-1.fc38.noarch              6/154
  Installing       : qt5-srpm-macros-5.15.12-1.fc38.noarch            7/154
  Installing       : publicsuffix-list-dafsa-20240107-1.fc38.noarch   8/154
  Installing       : package-notes-srpm-macros-0.5-8.fc38.noarch      9/154
  Installing       : ncurses-base-6.4-7.20230520.fc38.1.noarch       10/154
  Installing       : libssh-config-0.10.6-2.fc38.noarch              11/154
  Installing       : kernel-srpm-macros-1.0-19.fc38.noarch           12/154
  Installing       : coreutils-common-9.1-12.fc38.x86_64             13/154
  Installing       : ansible-srpm-macros-1-12.fc38.noarch            14/154
  Installing       : pkgconf-m4-1.8.0-6.fc38.noarch                  15/154
  Installing       : perl-srpm-macros-1-48.fc38.noarch               16/154
  Installing       : pcre2-syntax-10.42-1.fc38.1.noarch              17/154
  Installing       : openblas-srpm-macros-2-13.fc38.noarch           18/154
  Installing       : ocaml-srpm-macros-7-3.fc38.noarch               19/154
  Installing       : gnat-srpm-macros-6-2.fc38.noarch                20/154
  Installing       : ghc-srpm-macros-1.6.1-1.fc38.noarch             21/154
  Installing       : fpc-srpm-macros-1.3-7.fc38.noarch               22/154
  Installing       : fedora-gpg-keys-38-1.noarch                     23/154
  Installing       : fedora-release-38-36.noarch                     24/154
  Installing       : fedora-repos-38-1.noarch                        25/154
  Installing       : fedora-release-common-38-36.noarch              26/154
  Installing       : setup-2.14.3-2.fc38.noarch                      27/154
  Running scriptlet: setup-2.14.3-2.fc38.noarch                      27/154
  Installing       : filesystem-3.18-3.fc38.x86_64                   28/154
  Installing       : basesystem-11-15.fc38.noarch                    29/154
  Installing       : glibc-gconv-extra-2.37-18.fc38.x86_64           30/154
  Running scriptlet: glibc-gconv-extra-2.37-18.fc38.x86_64           30/154
  Installing       : glibc-minimal-langpack-2.37-18.fc38.x86_64      31/154
  Installing       : glibc-common-2.37-18.fc38.x86_64                32/154
  Running scriptlet: glibc-2.37-18.fc38.x86_64                       33/154
  Installing       : glibc-2.37-18.fc38.x86_64                       33/154
  Running scriptlet: glibc-2.37-18.fc38.x86_64                       33/154
  Installing       : ncurses-libs-6.4-7.20230520.fc38.1.x86_64       34/154
  Installing       : bash-5.2.26-1.fc38.x86_64                       35/154
  Running scriptlet: bash-5.2.26-1.fc38.x86_64                       35/154
  Installing       : zlib-1.2.13-3.fc38.x86_64                       36/154
  Installing       : xz-libs-5.4.1-1.fc38.x86_64                     37/154
  Installing       : bzip2-libs-1.0.8-13.fc38.x86_64                 38/154
  Installing       : libzstd-1.5.5-1.fc38.x86_64                     39/154
  Installing       : elfutils-libelf-0.191-1.fc38.x86_64             40/154
  Installing       : libuuid-2.38.1-4.fc38.x86_64                    41/154
  Installing       : popt-1.19-2.fc38.x86_64                         42/154
  Installing       : libstdc++-13.2.1-7.fc38.x86_64                  43/154
  Installing       : libblkid-2.38.1-4.fc38.x86_64                   44/154
  Installing       : readline-8.2-4.fc38.x86_64                      45/154
  Installing       : gmp-1:6.2.1-4.fc38.x86_64                       46/154
  Installing       : libattr-2.5.1-6.fc38.x86_64                     47/154
  Installing       : libacl-2.3.1-7.fc38.x86_64                      48/154
  Installing       : libcap-2.48-8.fc38.x86_64                       49/154
  Installing       : libxcrypt-4.4.36-1.fc38.x86_64                  50/154
  Installing       : lz4-libs-1.9.4-2.fc38.x86_64                    51/154
  Installing       : libeconf-0.5.2-1.fc38.x86_64                    52/154
  Installing       : systemd-libs-253.17-1.fc38.x86_64               53/154
  Installing       : mpfr-4.1.1-3.fc38.x86_64                        54/154
  Installing       : dwz-0.15-2.fc38.x86_64                          55/154
  Installing       : unzip-6.0-60.fc38.x86_64                        56/154
  Installing       : file-libs-5.44-3.fc38.x86_64                    57/154
  Installing       : file-5.44-3.fc38.x86_64                         58/154
  Installing       : sqlite-libs-3.40.1-2.fc38.x86_64                59/154
  Installing       : libcom_err-1.46.5-4.fc38.x86_64                 60/154
  Installing       : libsepol-3.5-1.fc38.x86_64                      61/154
  Installing       : libsmartcols-2.38.1-4.fc38.x86_64               62/154
  Installing       : libtasn1-4.19.0-2.fc38.x86_64                   63/154
  Installing       : lua-libs-5.4.4-9.fc38.x86_64                    64/154
  Installing       : pcre2-10.42-1.fc38.1.x86_64                     65/154
  Installing       : libselinux-3.5-1.fc38.x86_64                    66/154
  Installing       : sed-4.8-12.fc38.x86_64                          67/154
  Installing       : grep-3.8-3.fc38.x86_64                          68/154
  Installing       : findutils-1:4.9.0-3.fc38.x86_64                 69/154
  Installing       : xz-5.4.1-1.fc38.x86_64                          70/154
  Installing       : libmount-2.38.1-4.fc38.x86_64                   71/154
  Installing       : alternatives-1.26-1.fc38.x86_64                 72/154
  Installing       : libcap-ng-0.8.3-8.fc38.x86_64                   73/154
  Installing       : audit-libs-3.1.2-8.fc38.x86_64                  74/154
  Installing       : pam-libs-1.5.2-16.fc38.x86_64                   75/154
  Installing       : libsemanage-3.5-2.fc38.x86_64                   76/154
  Installing       : shadow-utils-2:4.13-6.fc38.x86_64               77/154
  Running scriptlet: libutempter-1.2.1-8.fc38.x86_64                 78/154
  Installing       : libutempter-1.2.1-8.fc38.x86_64                 78/154
  Installing       : util-linux-core-2.38.1-4.fc38.x86_64            79/154
  Installing       : tar-2:1.34-8.fc38.x86_64                        80/154
  Installing       : zip-3.0-37.fc38.x86_64                          81/154
  Installing       : zstd-1.5.5-1.fc38.x86_64                        82/154
  Installing       : libfdisk-2.38.1-4.fc38.x86_64                   83/154
  Installing       : bzip2-1.0.8-13.fc38.x86_64                      84/154
  Installing       : libxml2-2.10.4-1.fc38.x86_64                    85/154
  Installing       : ed-1.19-2.fc38.x86_64                           86/154
  Installing       : patch-2.7.6-19.fc38.x86_64                      87/154
  Installing       : elfutils-default-yama-scope-0.191-1.fc38.noarch 88/154
  Running scriptlet: elfutils-default-yama-scope-0.191-1.fc38.noarch 88/154
  Installing       : cpio-2.13-14.fc38.x86_64                        89/154
  Installing       : gdbm-libs-1:1.23-3.fc38.x86_64                  90/154
  Installing       : cyrus-sasl-lib-2.1.28-9.fc38.x86_64             91/154
  Installing       : jansson-2.13.1-6.fc38.x86_64                    92/154
  Installing       : libbrotli-1.0.9-11.fc38.x86_64                  93/154
  Installing       : libdb-5.3.28-55.fc38.x86_64                     94/154
  Installing       : libffi-3.4.4-2.fc38.x86_64                      95/154
  Installing       : p11-kit-0.25.3-1.fc38.x86_64                    96/154
  Installing       : p11-kit-trust-0.25.3-1.fc38.x86_64              97/154
  Running scriptlet: p11-kit-trust-0.25.3-1.fc38.x86_64              97/154
  Installing       : openssl-libs-1:3.0.9-2.fc38.x86_64              98/154
  Installing       : coreutils-9.1-12.fc38.x86_64                    99/154
  Running scriptlet: ca-certificates-2023.2.60_v7.0.306-1.0.fc38.noar 100/154
  Installing       : ca-certificates-2023.2.60_v7.0.306-1.0.fc38.noar 100/154
  Running scriptlet: ca-certificates-2023.2.60_v7.0.306-1.0.fc38.noar 100/154
Installing : gzip-1.12-3.fc38.x86_64 101/154 Running scriptlet: authselect-libs-1.4.3-1.fc38.x86_64 102/154 Installing : authselect-libs-1.4.3-1.fc38.x86_64 102/154 Installing : libarchive-3.6.1-4.fc38.x86_64 103/154 Installing : rpm-sequoia-1.6.0-1.fc38.x86_64 104/154 Installing : rpm-libs-4.18.2-1.fc38.x86_64 105/154 Installing : authselect-1.4.3-1.fc38.x86_64 106/154 Installing : cracklib-2.9.11-1.fc38.x86_64 107/154 Installing : libpwquality-1.4.5-3.fc38.x86_64 108/154 Installing : libevent-2.1.12-8.fc38.x86_64 109/154 Installing : openldap-2.6.6-1.fc38.x86_64 110/154 Installing : libpkgconf-1.8.0-6.fc38.x86_64 111/154 Installing : pkgconf-1.8.0-6.fc38.x86_64 112/154 Installing : pkgconf-pkg-config-1.8.0-6.fc38.x86_64 113/154 Installing : libsigsegv-2.14-4.fc38.x86_64 114/154 Installing : gawk-5.1.1-5.fc38.x86_64 115/154 Installing : libunistring-1.1-3.fc38.x86_64 116/154 Installing : libidn2-2.3.7-1.fc38.x86_64 117/154 Installing : libunistring1.0-1.0-1.fc38.x86_64 118/154 Installing : libpsl-0.21.2-2.fc38.x86_64 119/154 Installing : libverto-0.3.2-5.fc38.x86_64 120/154 Installing : diffutils-3.10-1.fc38.x86_64 121/154 Installing : keyutils-libs-1.6.3-1.fc38.x86_64 122/154 Installing : krb5-libs-1.21-3.fc38.x86_64 123/154 Installing : libtirpc-1.3.4-1.rc3.fc38.x86_64 124/154 Installing : libnsl2-2.0.0-5.fc38.x86_64 125/154 Installing : pam-1.5.2-16.fc38.x86_64 126/154 Installing : libssh-0.10.6-2.fc38.x86_64 127/154 Installing : libgomp-13.2.1-7.fc38.x86_64 128/154 Installing : libnghttp2-1.52.0-2.fc38.x86_64 129/154 Installing : libcurl-8.0.1-7.fc38.x86_64 130/154 Installing : elfutils-libs-0.191-1.fc38.x86_64 131/154 Installing : elfutils-debuginfod-client-0.191-1.fc38.x86_64 132/154 Installing : binutils-gold-2.39-16.fc38.x86_64 133/154 Installing : binutils-2.39-16.fc38.x86_64 134/154 Running scriptlet: binutils-2.39-16.fc38.x86_64 134/154 Installing : elfutils-0.191-1.fc38.x86_64 135/154 Installing : rpm-build-libs-4.18.2-1.fc38.x86_64 136/154 Installing 
: curl-8.0.1-7.fc38.x86_64 137/154 Running scriptlet: rpm-4.18.2-1.fc38.x86_64 138/154 Installing : rpm-4.18.2-1.fc38.x86_64 138/154 Installing : efi-srpm-macros-5-7.fc38.noarch 139/154 Installing : lua-srpm-macros-1-13.fc38.noarch 140/154 Installing : rpmautospec-rpm-macros-0.6.3-1.fc38.noarch 141/154 Installing : xxhash-libs-0.8.2-1.fc38.x86_64 142/154 Installing : gdb-minimal-14.1-3.fc38.x86_64 143/154 Installing : debugedit-5.0-9.fc38.x86_64 144/154 Installing : fonts-srpm-macros-1:2.0.5-11.fc38.noarch 145/154 Installing : python-srpm-macros-3.11-10.fc38.noarch 146/154 Installing : forge-srpm-macros-0.2.0-3.fc38.noarch 147/154 Installing : go-srpm-macros-3.5.0-1.fc38.noarch 148/154 Installing : redhat-rpm-config-257-1.fc38.noarch 149/154 Installing : rpm-build-4.18.2-1.fc38.x86_64 150/154 Installing : pyproject-srpm-macros-1.12.0-1.fc38.noarch 151/154 Installing : util-linux-2.38.1-4.fc38.x86_64 152/154 Installing : which-2.21-39.fc38.x86_64 153/154 Installing : info-7.0.2-2.fc38.x86_64 154/154 Running scriptlet: filesystem-3.18-3.fc38.x86_64 154/154 Running scriptlet: ca-certificates-2023.2.60_v7.0.306-1.0.fc38.noar 154/154 Running scriptlet: authselect-libs-1.4.3-1.fc38.x86_64 154/154 Running scriptlet: rpm-4.18.2-1.fc38.x86_64 154/154 Running scriptlet: info-7.0.2-2.fc38.x86_64 154/154 Verifying : basesystem-11-15.fc38.noarch 1/154 Verifying : bzip2-1.0.8-13.fc38.x86_64 2/154 Verifying : bzip2-libs-1.0.8-13.fc38.x86_64 3/154 Verifying : cpio-2.13-14.fc38.x86_64 4/154 Verifying : crypto-policies-20230301-1.gita12f7b2.fc38.noarc 5/154 Verifying : cyrus-sasl-lib-2.1.28-9.fc38.x86_64 6/154 Verifying : dwz-0.15-2.fc38.x86_64 7/154 Verifying : ed-1.19-2.fc38.x86_64 8/154 Verifying : efi-srpm-macros-5-7.fc38.noarch 9/154 Verifying : fedora-gpg-keys-38-1.noarch 10/154 Verifying : fedora-repos-38-1.noarch 11/154 Verifying : file-5.44-3.fc38.x86_64 12/154 Verifying : file-libs-5.44-3.fc38.x86_64 13/154 Verifying : filesystem-3.18-3.fc38.x86_64 14/154 Verifying : 
findutils-1:4.9.0-3.fc38.x86_64 15/154 Verifying : fonts-srpm-macros-1:2.0.5-11.fc38.noarch 16/154 Verifying : fpc-srpm-macros-1.3-7.fc38.noarch 17/154 Verifying : gawk-5.1.1-5.fc38.x86_64 18/154 Verifying : gdbm-libs-1:1.23-3.fc38.x86_64 19/154 Verifying : ghc-srpm-macros-1.6.1-1.fc38.noarch 20/154 Verifying : gmp-1:6.2.1-4.fc38.x86_64 21/154 Verifying : gnat-srpm-macros-6-2.fc38.noarch 22/154 Verifying : grep-3.8-3.fc38.x86_64 23/154 Verifying : gzip-1.12-3.fc38.x86_64 24/154 Verifying : info-7.0.2-2.fc38.x86_64 25/154 Verifying : jansson-2.13.1-6.fc38.x86_64 26/154 Verifying : libarchive-3.6.1-4.fc38.x86_64 27/154 Verifying : libattr-2.5.1-6.fc38.x86_64 28/154 Verifying : libblkid-2.38.1-4.fc38.x86_64 29/154 Verifying : libbrotli-1.0.9-11.fc38.x86_64 30/154 Verifying : libcom_err-1.46.5-4.fc38.x86_64 31/154 Verifying : libdb-5.3.28-55.fc38.x86_64 32/154 Verifying : libevent-2.1.12-8.fc38.x86_64 33/154 Verifying : libfdisk-2.38.1-4.fc38.x86_64 34/154 Verifying : libffi-3.4.4-2.fc38.x86_64 35/154 Verifying : libmount-2.38.1-4.fc38.x86_64 36/154 Verifying : libnsl2-2.0.0-5.fc38.x86_64 37/154 Verifying : libpkgconf-1.8.0-6.fc38.x86_64 38/154 Verifying : libpsl-0.21.2-2.fc38.x86_64 39/154 Verifying : libpwquality-1.4.5-3.fc38.x86_64 40/154 Verifying : libselinux-3.5-1.fc38.x86_64 41/154 Verifying : libsemanage-3.5-2.fc38.x86_64 42/154 Verifying : libsepol-3.5-1.fc38.x86_64 43/154 Verifying : libsigsegv-2.14-4.fc38.x86_64 44/154 Verifying : libsmartcols-2.38.1-4.fc38.x86_64 45/154 Verifying : libtasn1-4.19.0-2.fc38.x86_64 46/154 Verifying : libunistring-1.1-3.fc38.x86_64 47/154 Verifying : libunistring1.0-1.0-1.fc38.x86_64 48/154 Verifying : libutempter-1.2.1-8.fc38.x86_64 49/154 Verifying : libuuid-2.38.1-4.fc38.x86_64 50/154 Verifying : libverto-0.3.2-5.fc38.x86_64 51/154 Verifying : lua-libs-5.4.4-9.fc38.x86_64 52/154 Verifying : lz4-libs-1.9.4-2.fc38.x86_64 53/154 Verifying : mpfr-4.1.1-3.fc38.x86_64 54/154 Verifying : ocaml-srpm-macros-7-3.fc38.noarch 55/154 
Verifying : openblas-srpm-macros-2-13.fc38.noarch 56/154 Verifying : pam-1.5.2-16.fc38.x86_64 57/154 Verifying : pam-libs-1.5.2-16.fc38.x86_64 58/154 Verifying : patch-2.7.6-19.fc38.x86_64 59/154 Verifying : pcre2-10.42-1.fc38.1.x86_64 60/154 Verifying : pcre2-syntax-10.42-1.fc38.1.noarch 61/154 Verifying : perl-srpm-macros-1-48.fc38.noarch 62/154 Verifying : pkgconf-1.8.0-6.fc38.x86_64 63/154 Verifying : pkgconf-m4-1.8.0-6.fc38.noarch 64/154 Verifying : pkgconf-pkg-config-1.8.0-6.fc38.x86_64 65/154 Verifying : popt-1.19-2.fc38.x86_64 66/154 Verifying : python-srpm-macros-3.11-10.fc38.noarch 67/154 Verifying : sed-4.8-12.fc38.x86_64 68/154 Verifying : setup-2.14.3-2.fc38.noarch 69/154 Verifying : shadow-utils-2:4.13-6.fc38.x86_64 70/154 Verifying : sqlite-libs-3.40.1-2.fc38.x86_64 71/154 Verifying : tar-2:1.34-8.fc38.x86_64 72/154 Verifying : unzip-6.0-60.fc38.x86_64 73/154 Verifying : util-linux-2.38.1-4.fc38.x86_64 74/154 Verifying : util-linux-core-2.38.1-4.fc38.x86_64 75/154 Verifying : which-2.21-39.fc38.x86_64 76/154 Verifying : xz-5.4.1-1.fc38.x86_64 77/154 Verifying : xz-libs-5.4.1-1.fc38.x86_64 78/154 Verifying : zlib-1.2.13-3.fc38.x86_64 79/154 Verifying : alternatives-1.26-1.fc38.x86_64 80/154 Verifying : ansible-srpm-macros-1-12.fc38.noarch 81/154 Verifying : audit-libs-3.1.2-8.fc38.x86_64 82/154 Verifying : authselect-1.4.3-1.fc38.x86_64 83/154 Verifying : authselect-libs-1.4.3-1.fc38.x86_64 84/154 Verifying : bash-5.2.26-1.fc38.x86_64 85/154 Verifying : binutils-2.39-16.fc38.x86_64 86/154 Verifying : binutils-gold-2.39-16.fc38.x86_64 87/154 Verifying : ca-certificates-2023.2.60_v7.0.306-1.0.fc38.noar 88/154 Verifying : coreutils-9.1-12.fc38.x86_64 89/154 Verifying : coreutils-common-9.1-12.fc38.x86_64 90/154 Verifying : cracklib-2.9.11-1.fc38.x86_64 91/154 Verifying : curl-8.0.1-7.fc38.x86_64 92/154 Verifying : debugedit-5.0-9.fc38.x86_64 93/154 Verifying : diffutils-3.10-1.fc38.x86_64 94/154 Verifying : elfutils-0.191-1.fc38.x86_64 95/154 Verifying : 
elfutils-debuginfod-client-0.191-1.fc38.x86_64 96/154 Verifying : elfutils-default-yama-scope-0.191-1.fc38.noarch 97/154 Verifying : elfutils-libelf-0.191-1.fc38.x86_64 98/154 Verifying : elfutils-libs-0.191-1.fc38.x86_64 99/154 Verifying : fedora-release-38-36.noarch 100/154 Verifying : fedora-release-common-38-36.noarch 101/154 Verifying : fedora-release-identity-basic-38-36.noarch 102/154 Verifying : forge-srpm-macros-0.2.0-3.fc38.noarch 103/154 Verifying : gdb-minimal-14.1-3.fc38.x86_64 104/154 Verifying : glibc-2.37-18.fc38.x86_64 105/154 Verifying : glibc-common-2.37-18.fc38.x86_64 106/154 Verifying : glibc-gconv-extra-2.37-18.fc38.x86_64 107/154 Verifying : glibc-minimal-langpack-2.37-18.fc38.x86_64 108/154 Verifying : go-srpm-macros-3.5.0-1.fc38.noarch 109/154 Verifying : kernel-srpm-macros-1.0-19.fc38.noarch 110/154 Verifying : keyutils-libs-1.6.3-1.fc38.x86_64 111/154 Verifying : krb5-libs-1.21-3.fc38.x86_64 112/154 Verifying : libacl-2.3.1-7.fc38.x86_64 113/154 Verifying : libcap-2.48-8.fc38.x86_64 114/154 Verifying : libcap-ng-0.8.3-8.fc38.x86_64 115/154 Verifying : libcurl-8.0.1-7.fc38.x86_64 116/154 Verifying : libeconf-0.5.2-1.fc38.x86_64 117/154 Verifying : libgcc-13.2.1-7.fc38.x86_64 118/154 Verifying : libgomp-13.2.1-7.fc38.x86_64 119/154 Verifying : libidn2-2.3.7-1.fc38.x86_64 120/154 Verifying : libnghttp2-1.52.0-2.fc38.x86_64 121/154 Verifying : libssh-0.10.6-2.fc38.x86_64 122/154 Verifying : libssh-config-0.10.6-2.fc38.noarch 123/154 Verifying : libstdc++-13.2.1-7.fc38.x86_64 124/154 Verifying : libtirpc-1.3.4-1.rc3.fc38.x86_64 125/154 Verifying : libxcrypt-4.4.36-1.fc38.x86_64 126/154 Verifying : libxml2-2.10.4-1.fc38.x86_64 127/154 Verifying : libzstd-1.5.5-1.fc38.x86_64 128/154 Verifying : lua-srpm-macros-1-13.fc38.noarch 129/154 Verifying : ncurses-base-6.4-7.20230520.fc38.1.noarch 130/154 Verifying : ncurses-libs-6.4-7.20230520.fc38.1.x86_64 131/154 Verifying : openldap-2.6.6-1.fc38.x86_64 132/154 Verifying : 
openssl-libs-1:3.0.9-2.fc38.x86_64 133/154 Verifying : p11-kit-0.25.3-1.fc38.x86_64 134/154 Verifying : p11-kit-trust-0.25.3-1.fc38.x86_64 135/154 Verifying : package-notes-srpm-macros-0.5-8.fc38.noarch 136/154 Verifying : publicsuffix-list-dafsa-20240107-1.fc38.noarch 137/154 Verifying : pyproject-srpm-macros-1.12.0-1.fc38.noarch 138/154 Verifying : qt5-srpm-macros-5.15.12-1.fc38.noarch 139/154 Verifying : qt6-srpm-macros-6.6.0-1.fc38.noarch 140/154 Verifying : readline-8.2-4.fc38.x86_64 141/154 Verifying : redhat-rpm-config-257-1.fc38.noarch 142/154 Verifying : rpm-4.18.2-1.fc38.x86_64 143/154 Verifying : rpm-build-4.18.2-1.fc38.x86_64 144/154 Verifying : rpm-build-libs-4.18.2-1.fc38.x86_64 145/154 Verifying : rpm-libs-4.18.2-1.fc38.x86_64 146/154 Verifying : rpm-sequoia-1.6.0-1.fc38.x86_64 147/154 Verifying : rpmautospec-rpm-macros-0.6.3-1.fc38.noarch 148/154 Verifying : rust-srpm-macros-26.2-1.fc38.noarch 149/154 Verifying : systemd-libs-253.17-1.fc38.x86_64 150/154 Verifying : tzdata-2024a-1.fc38.noarch 151/154 Verifying : xxhash-libs-0.8.2-1.fc38.x86_64 152/154 Verifying : zip-3.0-37.fc38.x86_64 153/154 Verifying : zstd-1.5.5-1.fc38.x86_64 154/154 Installed: alternatives-1.26-1.fc38.x86_64 ansible-srpm-macros-1-12.fc38.noarch audit-libs-3.1.2-8.fc38.x86_64 authselect-1.4.3-1.fc38.x86_64 authselect-libs-1.4.3-1.fc38.x86_64 basesystem-11-15.fc38.noarch bash-5.2.26-1.fc38.x86_64 binutils-2.39-16.fc38.x86_64 binutils-gold-2.39-16.fc38.x86_64 bzip2-1.0.8-13.fc38.x86_64 bzip2-libs-1.0.8-13.fc38.x86_64 ca-certificates-2023.2.60_v7.0.306-1.0.fc38.noarch coreutils-9.1-12.fc38.x86_64 coreutils-common-9.1-12.fc38.x86_64 cpio-2.13-14.fc38.x86_64 cracklib-2.9.11-1.fc38.x86_64 crypto-policies-20230301-1.gita12f7b2.fc38.noarch curl-8.0.1-7.fc38.x86_64 cyrus-sasl-lib-2.1.28-9.fc38.x86_64 debugedit-5.0-9.fc38.x86_64 diffutils-3.10-1.fc38.x86_64 dwz-0.15-2.fc38.x86_64 ed-1.19-2.fc38.x86_64 efi-srpm-macros-5-7.fc38.noarch elfutils-0.191-1.fc38.x86_64 
elfutils-debuginfod-client-0.191-1.fc38.x86_64 elfutils-default-yama-scope-0.191-1.fc38.noarch elfutils-libelf-0.191-1.fc38.x86_64 elfutils-libs-0.191-1.fc38.x86_64 fedora-gpg-keys-38-1.noarch fedora-release-38-36.noarch fedora-release-common-38-36.noarch fedora-release-identity-basic-38-36.noarch fedora-repos-38-1.noarch file-5.44-3.fc38.x86_64 file-libs-5.44-3.fc38.x86_64 filesystem-3.18-3.fc38.x86_64 findutils-1:4.9.0-3.fc38.x86_64 fonts-srpm-macros-1:2.0.5-11.fc38.noarch forge-srpm-macros-0.2.0-3.fc38.noarch fpc-srpm-macros-1.3-7.fc38.noarch gawk-5.1.1-5.fc38.x86_64 gdb-minimal-14.1-3.fc38.x86_64 gdbm-libs-1:1.23-3.fc38.x86_64 ghc-srpm-macros-1.6.1-1.fc38.noarch glibc-2.37-18.fc38.x86_64 glibc-common-2.37-18.fc38.x86_64 glibc-gconv-extra-2.37-18.fc38.x86_64 glibc-minimal-langpack-2.37-18.fc38.x86_64 gmp-1:6.2.1-4.fc38.x86_64 gnat-srpm-macros-6-2.fc38.noarch go-srpm-macros-3.5.0-1.fc38.noarch grep-3.8-3.fc38.x86_64 gzip-1.12-3.fc38.x86_64 info-7.0.2-2.fc38.x86_64 jansson-2.13.1-6.fc38.x86_64 kernel-srpm-macros-1.0-19.fc38.noarch keyutils-libs-1.6.3-1.fc38.x86_64 krb5-libs-1.21-3.fc38.x86_64 libacl-2.3.1-7.fc38.x86_64 libarchive-3.6.1-4.fc38.x86_64 libattr-2.5.1-6.fc38.x86_64 libblkid-2.38.1-4.fc38.x86_64 libbrotli-1.0.9-11.fc38.x86_64 libcap-2.48-8.fc38.x86_64 libcap-ng-0.8.3-8.fc38.x86_64 libcom_err-1.46.5-4.fc38.x86_64 libcurl-8.0.1-7.fc38.x86_64 libdb-5.3.28-55.fc38.x86_64 libeconf-0.5.2-1.fc38.x86_64 libevent-2.1.12-8.fc38.x86_64 libfdisk-2.38.1-4.fc38.x86_64 libffi-3.4.4-2.fc38.x86_64 libgcc-13.2.1-7.fc38.x86_64 libgomp-13.2.1-7.fc38.x86_64 libidn2-2.3.7-1.fc38.x86_64 libmount-2.38.1-4.fc38.x86_64 libnghttp2-1.52.0-2.fc38.x86_64 libnsl2-2.0.0-5.fc38.x86_64 libpkgconf-1.8.0-6.fc38.x86_64 libpsl-0.21.2-2.fc38.x86_64 libpwquality-1.4.5-3.fc38.x86_64 libselinux-3.5-1.fc38.x86_64 libsemanage-3.5-2.fc38.x86_64 libsepol-3.5-1.fc38.x86_64 libsigsegv-2.14-4.fc38.x86_64 libsmartcols-2.38.1-4.fc38.x86_64 libssh-0.10.6-2.fc38.x86_64 libssh-config-0.10.6-2.fc38.noarch 
libstdc++-13.2.1-7.fc38.x86_64 libtasn1-4.19.0-2.fc38.x86_64 libtirpc-1.3.4-1.rc3.fc38.x86_64 libunistring-1.1-3.fc38.x86_64 libunistring1.0-1.0-1.fc38.x86_64 libutempter-1.2.1-8.fc38.x86_64 libuuid-2.38.1-4.fc38.x86_64 libverto-0.3.2-5.fc38.x86_64 libxcrypt-4.4.36-1.fc38.x86_64 libxml2-2.10.4-1.fc38.x86_64 libzstd-1.5.5-1.fc38.x86_64 lua-libs-5.4.4-9.fc38.x86_64 lua-srpm-macros-1-13.fc38.noarch lz4-libs-1.9.4-2.fc38.x86_64 mpfr-4.1.1-3.fc38.x86_64 ncurses-base-6.4-7.20230520.fc38.1.noarch ncurses-libs-6.4-7.20230520.fc38.1.x86_64 ocaml-srpm-macros-7-3.fc38.noarch openblas-srpm-macros-2-13.fc38.noarch openldap-2.6.6-1.fc38.x86_64 openssl-libs-1:3.0.9-2.fc38.x86_64 p11-kit-0.25.3-1.fc38.x86_64 p11-kit-trust-0.25.3-1.fc38.x86_64 package-notes-srpm-macros-0.5-8.fc38.noarch pam-1.5.2-16.fc38.x86_64 pam-libs-1.5.2-16.fc38.x86_64 patch-2.7.6-19.fc38.x86_64 pcre2-10.42-1.fc38.1.x86_64 pcre2-syntax-10.42-1.fc38.1.noarch perl-srpm-macros-1-48.fc38.noarch pkgconf-1.8.0-6.fc38.x86_64 pkgconf-m4-1.8.0-6.fc38.noarch pkgconf-pkg-config-1.8.0-6.fc38.x86_64 popt-1.19-2.fc38.x86_64 publicsuffix-list-dafsa-20240107-1.fc38.noarch pyproject-srpm-macros-1.12.0-1.fc38.noarch python-srpm-macros-3.11-10.fc38.noarch qt5-srpm-macros-5.15.12-1.fc38.noarch qt6-srpm-macros-6.6.0-1.fc38.noarch readline-8.2-4.fc38.x86_64 redhat-rpm-config-257-1.fc38.noarch rpm-4.18.2-1.fc38.x86_64 rpm-build-4.18.2-1.fc38.x86_64 rpm-build-libs-4.18.2-1.fc38.x86_64 rpm-libs-4.18.2-1.fc38.x86_64 rpm-sequoia-1.6.0-1.fc38.x86_64 rpmautospec-rpm-macros-0.6.3-1.fc38.noarch rust-srpm-macros-26.2-1.fc38.noarch sed-4.8-12.fc38.x86_64 setup-2.14.3-2.fc38.noarch shadow-utils-2:4.13-6.fc38.x86_64 sqlite-libs-3.40.1-2.fc38.x86_64 systemd-libs-253.17-1.fc38.x86_64 tar-2:1.34-8.fc38.x86_64 tzdata-2024a-1.fc38.noarch unzip-6.0-60.fc38.x86_64 util-linux-2.38.1-4.fc38.x86_64 util-linux-core-2.38.1-4.fc38.x86_64 which-2.21-39.fc38.x86_64 xxhash-libs-0.8.2-1.fc38.x86_64 xz-5.4.1-1.fc38.x86_64 xz-libs-5.4.1-1.fc38.x86_64 
zip-3.0-37.fc38.x86_64 zlib-1.2.13-3.fc38.x86_64 zstd-1.5.5-1.fc38.x86_64 Complete! Finish: installing minimal buildroot with dnf Start: creating root cache Finish: creating root cache Finish: chroot init INFO: Installed packages: INFO: alternatives-1.26-1.fc38.x86_64 ansible-srpm-macros-1-12.fc38.noarch audit-libs-3.1.2-8.fc38.x86_64 authselect-1.4.3-1.fc38.x86_64 authselect-libs-1.4.3-1.fc38.x86_64 basesystem-11-15.fc38.noarch bash-5.2.26-1.fc38.x86_64 binutils-2.39-16.fc38.x86_64 binutils-gold-2.39-16.fc38.x86_64 bzip2-1.0.8-13.fc38.x86_64 bzip2-libs-1.0.8-13.fc38.x86_64 ca-certificates-2023.2.60_v7.0.306-1.0.fc38.noarch coreutils-9.1-12.fc38.x86_64 coreutils-common-9.1-12.fc38.x86_64 cpio-2.13-14.fc38.x86_64 cracklib-2.9.11-1.fc38.x86_64 crypto-policies-20230301-1.gita12f7b2.fc38.noarch curl-8.0.1-7.fc38.x86_64 cyrus-sasl-lib-2.1.28-9.fc38.x86_64 debugedit-5.0-9.fc38.x86_64 diffutils-3.10-1.fc38.x86_64 dwz-0.15-2.fc38.x86_64 ed-1.19-2.fc38.x86_64 efi-srpm-macros-5-7.fc38.noarch elfutils-0.191-1.fc38.x86_64 elfutils-debuginfod-client-0.191-1.fc38.x86_64 elfutils-default-yama-scope-0.191-1.fc38.noarch elfutils-libelf-0.191-1.fc38.x86_64 elfutils-libs-0.191-1.fc38.x86_64 fedora-gpg-keys-38-1.noarch fedora-release-38-36.noarch fedora-release-common-38-36.noarch fedora-release-identity-basic-38-36.noarch fedora-repos-38-1.noarch file-5.44-3.fc38.x86_64 file-libs-5.44-3.fc38.x86_64 filesystem-3.18-3.fc38.x86_64 findutils-4.9.0-3.fc38.x86_64 fonts-srpm-macros-2.0.5-11.fc38.noarch forge-srpm-macros-0.2.0-3.fc38.noarch fpc-srpm-macros-1.3-7.fc38.noarch gawk-5.1.1-5.fc38.x86_64 gdb-minimal-14.1-3.fc38.x86_64 gdbm-libs-1.23-3.fc38.x86_64 ghc-srpm-macros-1.6.1-1.fc38.noarch glibc-2.37-18.fc38.x86_64 glibc-common-2.37-18.fc38.x86_64 glibc-gconv-extra-2.37-18.fc38.x86_64 glibc-minimal-langpack-2.37-18.fc38.x86_64 gmp-6.2.1-4.fc38.x86_64 gnat-srpm-macros-6-2.fc38.noarch go-srpm-macros-3.5.0-1.fc38.noarch gpg-pubkey-eb10b464-6202d9c6 grep-3.8-3.fc38.x86_64 
gzip-1.12-3.fc38.x86_64 info-7.0.2-2.fc38.x86_64 jansson-2.13.1-6.fc38.x86_64 kernel-srpm-macros-1.0-19.fc38.noarch keyutils-libs-1.6.3-1.fc38.x86_64 krb5-libs-1.21-3.fc38.x86_64 libacl-2.3.1-7.fc38.x86_64 libarchive-3.6.1-4.fc38.x86_64 libattr-2.5.1-6.fc38.x86_64 libblkid-2.38.1-4.fc38.x86_64 libbrotli-1.0.9-11.fc38.x86_64 libcap-2.48-8.fc38.x86_64 libcap-ng-0.8.3-8.fc38.x86_64 libcom_err-1.46.5-4.fc38.x86_64 libcurl-8.0.1-7.fc38.x86_64 libdb-5.3.28-55.fc38.x86_64 libeconf-0.5.2-1.fc38.x86_64 libevent-2.1.12-8.fc38.x86_64 libfdisk-2.38.1-4.fc38.x86_64 libffi-3.4.4-2.fc38.x86_64 libgcc-13.2.1-7.fc38.x86_64 libgomp-13.2.1-7.fc38.x86_64 libidn2-2.3.7-1.fc38.x86_64 libmount-2.38.1-4.fc38.x86_64 libnghttp2-1.52.0-2.fc38.x86_64 libnsl2-2.0.0-5.fc38.x86_64 libpkgconf-1.8.0-6.fc38.x86_64 libpsl-0.21.2-2.fc38.x86_64 libpwquality-1.4.5-3.fc38.x86_64 libselinux-3.5-1.fc38.x86_64 libsemanage-3.5-2.fc38.x86_64 libsepol-3.5-1.fc38.x86_64 libsigsegv-2.14-4.fc38.x86_64 libsmartcols-2.38.1-4.fc38.x86_64 libssh-0.10.6-2.fc38.x86_64 libssh-config-0.10.6-2.fc38.noarch libstdc++-13.2.1-7.fc38.x86_64 libtasn1-4.19.0-2.fc38.x86_64 libtirpc-1.3.4-1.rc3.fc38.x86_64 libunistring-1.1-3.fc38.x86_64 libunistring1.0-1.0-1.fc38.x86_64 libutempter-1.2.1-8.fc38.x86_64 libuuid-2.38.1-4.fc38.x86_64 libverto-0.3.2-5.fc38.x86_64 libxcrypt-4.4.36-1.fc38.x86_64 libxml2-2.10.4-1.fc38.x86_64 libzstd-1.5.5-1.fc38.x86_64 lua-libs-5.4.4-9.fc38.x86_64 lua-srpm-macros-1-13.fc38.noarch lz4-libs-1.9.4-2.fc38.x86_64 mpfr-4.1.1-3.fc38.x86_64 ncurses-base-6.4-7.20230520.fc38.1.noarch ncurses-libs-6.4-7.20230520.fc38.1.x86_64 ocaml-srpm-macros-7-3.fc38.noarch openblas-srpm-macros-2-13.fc38.noarch openldap-2.6.6-1.fc38.x86_64 openssl-libs-3.0.9-2.fc38.x86_64 p11-kit-0.25.3-1.fc38.x86_64 p11-kit-trust-0.25.3-1.fc38.x86_64 package-notes-srpm-macros-0.5-8.fc38.noarch pam-1.5.2-16.fc38.x86_64 pam-libs-1.5.2-16.fc38.x86_64 patch-2.7.6-19.fc38.x86_64 pcre2-10.42-1.fc38.1.x86_64 pcre2-syntax-10.42-1.fc38.1.noarch 
perl-srpm-macros-1-48.fc38.noarch pkgconf-1.8.0-6.fc38.x86_64 pkgconf-m4-1.8.0-6.fc38.noarch pkgconf-pkg-config-1.8.0-6.fc38.x86_64 popt-1.19-2.fc38.x86_64 publicsuffix-list-dafsa-20240107-1.fc38.noarch pyproject-srpm-macros-1.12.0-1.fc38.noarch python-srpm-macros-3.11-10.fc38.noarch qt5-srpm-macros-5.15.12-1.fc38.noarch qt6-srpm-macros-6.6.0-1.fc38.noarch readline-8.2-4.fc38.x86_64 redhat-rpm-config-257-1.fc38.noarch rpm-4.18.2-1.fc38.x86_64 rpm-build-4.18.2-1.fc38.x86_64 rpm-build-libs-4.18.2-1.fc38.x86_64 rpm-libs-4.18.2-1.fc38.x86_64 rpm-sequoia-1.6.0-1.fc38.x86_64 rpmautospec-rpm-macros-0.6.3-1.fc38.noarch rust-srpm-macros-26.2-1.fc38.noarch sed-4.8-12.fc38.x86_64 setup-2.14.3-2.fc38.noarch shadow-utils-4.13-6.fc38.x86_64 sqlite-libs-3.40.1-2.fc38.x86_64 systemd-libs-253.17-1.fc38.x86_64 tar-1.34-8.fc38.x86_64 tzdata-2024a-1.fc38.noarch unzip-6.0-60.fc38.x86_64 util-linux-2.38.1-4.fc38.x86_64 util-linux-core-2.38.1-4.fc38.x86_64 which-2.21-39.fc38.x86_64 xxhash-libs-0.8.2-1.fc38.x86_64 xz-5.4.1-1.fc38.x86_64 xz-libs-5.4.1-1.fc38.x86_64 zip-3.0-37.fc38.x86_64 zlib-1.2.13-3.fc38.x86_64 zstd-1.5.5-1.fc38.x86_64 Start: buildsrpm Start: rpmbuild -bs warning: %patchN is deprecated (2 usages found), use %patch N (or %patch -P N) Building target platforms: x86_64 Building for target x86_64 setting SOURCE_DATE_EPOCH=1554595200 Wrote: /builddir/build/SRPMS/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.src.rpm RPM build warnings: %patchN is deprecated (2 usages found), use %patch N (or %patch -P N) Finish: rpmbuild -bs cp: preserving permissions for ‘/var/lib/copr-rpmbuild/results/chroot_scan/var/lib/mock/fedora-38-x86_64-1712885354.239135/root/var/log’: No such file or directory INFO: chroot_scan: 3 files copied to /var/lib/copr-rpmbuild/results/chroot_scan INFO: /var/lib/mock/fedora-38-x86_64-1712885354.239135/root/var/log/dnf.rpm.log /var/lib/mock/fedora-38-x86_64-1712885354.239135/root/var/log/dnf.librepo.log 
/var/lib/mock/fedora-38-x86_64-1712885354.239135/root/var/log/dnf.log Finish: buildsrpm INFO: Done(/var/lib/copr-rpmbuild/workspace/workdir-k3kry55z/pytorch/pytorch.spec) Config(child) 0 minutes 42 seconds INFO: Results and/or logs in: /var/lib/copr-rpmbuild/results INFO: Cleaning up build root ('cleanup_on_success=True') Start: clean chroot INFO: unmounting tmpfs. Finish: clean chroot INFO: Start(/var/lib/copr-rpmbuild/results/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.src.rpm) Config(fedora-38-x86_64) Start(bootstrap): chroot init INFO: mounting tmpfs at /var/lib/mock/fedora-38-x86_64-bootstrap-1712885354.239135/root. INFO: reusing tmpfs at /var/lib/mock/fedora-38-x86_64-bootstrap-1712885354.239135/root. INFO: calling preinit hooks INFO: enabled root cache INFO: enabled package manager cache Start(bootstrap): cleaning package manager metadata Finish(bootstrap): cleaning package manager metadata Finish(bootstrap): chroot init Start: chroot init INFO: mounting tmpfs at /var/lib/mock/fedora-38-x86_64-1712885354.239135/root. 
INFO: calling preinit hooks INFO: enabled root cache Start: unpacking root cache Finish: unpacking root cache INFO: enabled package manager cache Start: cleaning package manager metadata Finish: cleaning package manager metadata INFO: enabled HW Info plugin INFO: Buildroot is handled by package management downloaded with a bootstrap image: rpm-4.18.2-1.fc38.x86_64 rpm-sequoia-1.5.0-2.fc38.x86_64 python3-dnf-4.19.2-1.fc38.noarch python3-dnf-plugins-core-4.6.0-1.fc38.noarch yum-4.19.2-1.fc38.noarch Finish: chroot init Start: build phase for pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.src.rpm Start: build setup for pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.src.rpm warning: %patchN is deprecated (2 usages found), use %patch N (or %patch -P N) Building target platforms: x86_64 Building for target x86_64 setting SOURCE_DATE_EPOCH=1554595200 Wrote: /builddir/build/SRPMS/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.src.rpm RPM build warnings: %patchN is deprecated (2 usages found), use %patch N (or %patch -P N) No matches found for the following disable plugin patterns: local, spacewalk, versionlock Copr repository 70 kB/s | 1.8 kB 00:00 Additional repo copr_rezso_CUDA 70 kB/s | 1.8 kB 00:00 Additional repo http_developer_download_nvidia_ 759 kB/s | 3.5 kB 00:00 Additional repo http_developer_download_nvidia_ 629 kB/s | 3.5 kB 00:00 Additional repo http_developer_download_nvidia_ 555 kB/s | 3.5 kB 00:00 fedora 150 kB/s | 24 kB 00:00 updates 203 kB/s | 9.1 kB 00:00 Dependencies resolved. 
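The two RPM build warnings above ("%patchN is deprecated (2 usages found), use %patch N (or %patch -P N)") refer to the pytorch.spec file, which still applies its patches with the old numbered-macro form. The change rpmbuild is asking for is purely syntactic; a generic illustration of the migration (not the actual pytorch.spec contents):

```
# Deprecated form that triggers the warning in rpm >= 4.18:
%patch0 -p1
%patch1 -p1

# Current equivalents accepted by newer rpm:
%patch -P 0 -p1
%patch -P 1 -p1
```

The warning is harmless at this stage (the SRPM is still written successfully), but the `%patchN` spelling may be removed entirely in a future rpm release, so spec files carrying it will eventually fail to build.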
====================================================================================================================================================================
 Package                                  Arch       Version                                          Repository                                                Size
====================================================================================================================================================================
Installing:
 asmjit-devel x86_64 1:0-20220702.1.gitc5984762.fc38 copr_base 230 k
 cpuinfo-devel x86_64 1:0-20240327.0.gitf42f5eaf.fc38 copr_base 24 k
 cuda-cudart-devel-12-3 x86_64 12.3.101-1 http_developer_download_nvidia_com_compute_cuda_repos_rhel8_x86_64 2.0 M
 cuda-cupti-12-3 x86_64 12.3.101-1 http_developer_download_nvidia_com_compute_cuda_repos_rhel8_x86_64 29 M
 cuda-driver-devel-12-3 x86_64 12.3.101-1 http_developer_download_nvidia_com_compute_cuda_repos_rhel8_x86_64 42 k
 cuda-gcc-12-c++ x86_64 12.3.1-1.fc38 copr_base 15 M
 cuda-nvcc-12-3 x86_64 12.3.107-1 http_developer_download_nvidia_com_compute_cuda_repos_rhel8_x86_64 64 M
 cuda-nvml-devel-12-3 x86_64 12.3.101-1 http_developer_download_nvidia_com_compute_cuda_repos_rhel8_x86_64 121 k
 cuda-nvrtc-devel-12-3 x86_64 12.3.107-1 http_developer_download_nvidia_com_compute_cuda_repos_rhel8_x86_64 22 M
 cuda-nvtx-12-3 x86_64 12.3.101-1 http_developer_download_nvidia_com_compute_cuda_repos_rhel8_x86_64 88 k
 cuda-profiler-api-12-3 x86_64 12.3.101-1 http_developer_download_nvidia_com_compute_cuda_repos_rhel8_x86_64 26 k
 cutlass-devel x86_64 3.4.1-20240215.0.cu12_3.fc38 copr_base 774 k
 doxygen x86_64 2:1.9.6-7.fc38 fedora 4.8 M
 eigen3-devel noarch 3.4.0-9.fc38 fedora 1.2 M
 fbgemm-devel x86_64 0.7.0-20240315.0.git0049a2ca.fc38 copr_base 63 k
 fftw-devel x86_64 3.3.10-10.fc38 updates 135 k
 flatbuffers-compiler x86_64 23.3.3-1.fc38 fedora 1.0 M
 flatbuffers-devel x86_64 23.3.3-1.fc38 fedora 107 k
 foxi-devel x86_64 0-20210526.1.gitc278588e.fc37 copr_base 24 k
 fp16-devel x86_64 1:0-20240410.0.git581ac1c7.fc38 copr_base 13 k
 fxdiv-devel noarch 1:0-20201208.1.git63058eff.fc38 copr_base 12 k
 gcc-c++ x86_64 13.2.1-7.fc38 updates 13 M
 gemmlowp-devel noarch 0-20231104.0.git16e8662c.fc38 copr_base 157 k
 gflags-devel x86_64 2.2.2-11.fc38 fedora 24 k
 git x86_64 2.44.0-1.fc38 updates 53 k
 glog-devel x86_64 0.3.5-17.fc38 fedora 38 k
 gloo-devel x86_64 1:0.5.0-20240302.0.git2565674c.cu12_3.fc38 copr_base 74 k
 gmp-devel x86_64 1:6.2.1-4.fc38 fedora 173 k
 hiredis-devel x86_64 1.0.2-4.fc38 fedora 37 k
 kineto-devel x86_64 0.4.0-20240327.0.git445909a8.cu12_3.fc38 copr_base 23 k
 leveldb-devel x86_64 1.23-6.fc38 fedora 53 k
 libcublas-devel-12-3 x86_64 12.3.4.1-2 copr_rezso_CUDA 88 k
 libcudnn8-devel x86_64 8.9.7.29-2.cuda12.3 copr_rezso_CUDA 34 k
 libcufft-devel-12-3 x86_64 11.0.12.1-2 copr_rezso_CUDA 34 k
 libcurand-devel-12-3 x86_64 10.3.4.107-1 http_developer_download_nvidia_com_compute_cuda_repos_rhel8_x86_64 53 M
 libcusolver-devel-12-3 x86_64 11.5.4.101-2 copr_rezso_CUDA 60 k
 libcusparse-devel-12-3 x86_64 12.2.0.103-2 copr_rezso_CUDA 108 M
 libnccl-devel x86_64 2.21.5-1+cuda12.4 http_developer_download_nvidia_com_compute_cuda_repos_rhel8_x86_64 16 k
 libnvjitlink-devel-12-3 x86_64 12.3.101-1 http_developer_download_nvidia_com_compute_cuda_repos_rhel8_x86_64 18 M
 libuv-devel x86_64 1:1.48.0-1.fc38 updates 42 k
 libzstd-devel x86_64 1.5.5-1.fc38 updates 51 k
 lmdb-devel x86_64 0.9.32-1.fc38 updates 26 k
 magma-devel x86_64 2.8.0-20240328.0.cu12_3.fc38 copr_base 976 k
 mesa-libGLU-devel x86_64 9.0.3-1.fc38 updates 12 k
 miniz-devel x86_64 3.0.2-2.fc38 fedora 33 k
 mpfr-devel x86_64 4.1.1-3.fc38 fedora 21 k
 neon2sse-devel noarch 0-20230131.0.git097a5eca.fc38 copr_base 85 k
 nnpack-devel x86_64 0-20230201.0.git70a77f48.fc38 copr_base 16 k
 numactl-devel x86_64 2.0.16-2.fc38 fedora 22 k
 ocl-icd-devel x86_64 2.3.2-1.fc38 updates 63 k
 onnx-devel x86_64 1.17.0-20240404.0.git4128a090.fc38 copr_base 129 k
 onnx-optimizer-devel x86_64 0.3.19-20240303.0.gitb3a46118.fc38 copr_base 50 k
 openblas-devel x86_64 0.3.21-4.fc38 fedora 81 k
 openblas-openmp x86_64 0.3.21-4.fc38 fedora 5.1 M
 opencv-devel x86_64 4.9.0-20231227.1.cu12_3.fc38 copr_base 1.3 M
 peachpy-python3 noarch 0-20221113.1.git349e8f83.fc38 copr_base 700 k
 protobuf-compat-compiler x86_64 3.21.9-2.fc38 copr_base 919 k
 protobuf-compat-devel x86_64 3.21.9-2.fc38 copr_base 374 k
 psimd-devel noarch 1:0-20200517.2.git072586a7.fc38 copr_base 13 k
 pthreadpool-devel x86_64 1:0.1-20240121.0.git178e3e06.fc38 copr_base 15 k
 pybind11-devel x86_64 2.10.3-2.fc38 fedora 172 k
 python3-devel x86_64 3.11.8-2.fc38 updates 270 k
 python3-numpy x86_64 1:1.24.4-1.fc38 updates 7.9 M
 python3-pybind11 x86_64 2.10.3-2.fc38 fedora 194 k
 python3-pyyaml x86_64 6.0-6.fc38 fedora 225 k
 python3-setuptools noarch 65.5.1-2.fc38 fedora 1.7 M
 python3-six noarch 1.16.0-9.fc38 fedora 42 k
 python3-typing-extensions noarch 4.5.0-1.fc38 fedora 63 k
 qnnpack-devel x86_64 0-20190828.2.git7d2a4e99.fc38 copr_base 12 k
 rdma-core-devel x86_64 44.0-3.fc38 fedora 418 k
 rocksdb-devel x86_64 7.8.3-1.fc38 fedora 285 k
 sleef-devel x86_64 3.6-20240320.0.git60e76d2b.fc38 copr_base 28 k
 snappy-devel x86_64 1.1.9-7.fc38 fedora 21 k
 tbb-devel x86_64 2020.3-16.fc38 fedora 335 k
 tensorpipe-devel x86_64 0-20220513.1.gitbb1473a4.fc37 copr_base 109 k
 zeromq-devel x86_64 4.3.4-5.fc38 fedora 16 k
Installing dependencies:
 Lmod x86_64 8.7.32-1.fc38 updates 261 k
 MUMPS x86_64 5.5.1-1.fc38 fedora 2.0 M
 MUMPS-common noarch 5.5.1-1.fc38 fedora 830 k
 SuperLU x86_64 5.3.0-4.fc38 fedora 183 k
 adobe-mappings-cmap noarch 20230622-1.fc38 updates 2.1 M
 adobe-mappings-cmap-deprecated noarch 20230622-1.fc38 updates 113 k
 adobe-mappings-pdf noarch 20190401-3.fc38 fedora 698 k
 alsa-lib x86_64 1.2.11-2.fc38 updates 520 k
 annobin-docs noarch 12.40-1.fc38 updates 87 k
 annobin-plugin-gcc x86_64 12.40-1.fc38 updates 955 k
 armadillo x86_64 12.8.1-1.fc38 updates 32 k
 arpack x86_64 3.8.0-6.fc38 fedora 206 k
 asmjit x86_64 1:0-20220702.1.gitc5984762.fc38 copr_base 205 k
 avahi-libs x86_64 0.8-22.fc38 updates 66 k
 blosc x86_64 1.21.5-2.fc38 updates 59 k
 byte-buddy noarch 1.12.10-3.fc38 fedora 2.9 M
 byte-buddy-agent noarch 1.12.10-3.fc38 fedora 65 k
 cairo x86_64 1.17.8-4.fc38 updates 704 k
 cairo-gobject x86_64 1.17.8-4.fc38 updates 18 k
 cdparanoia-libs x86_64 10.2-41.fc38 fedora 54 k
 ceres-solver x86_64 2.1.0-5.fc38 fedora 720 k
 cfitsio x86_64 4.2.0-3.fc38 fedora 607 k
 cgnslib-libs x86_64 4.3.0-7.fc38 fedora 296 k
 cjson x86_64 1.7.14-7.fc38 fedora 31 k
 clang15-libs x86_64 15.0.7-5.fc38 updates 21 M
 clang15-resource-filesystem x86_64 15.0.7-5.fc38 updates 11 k
 cliquer-libs x86_64 1.22-5.fc38 fedora 38 k
 cmake x86_64 3.27.7-1.fc38 updates 8.0 M
 cmake-data noarch 3.27.7-1.fc38 updates 2.2 M
 cmake-filesystem x86_64 3.27.7-1.fc38 updates 19 k
 cmake-rpm-macros noarch 3.27.7-1.fc38 updates 18 k
 codec2 x86_64 1.0.5-2.fc38 fedora 641 k
 coin-or-Cbc x86_64 2.10.5-12.fc38 fedora 833 k
 coin-or-Cgl x86_64 0.60.3-9.fc38 fedora 432 k
 coin-or-Clp x86_64 1.17.6-12.fc38 fedora 938 k
 coin-or-CoinUtils x86_64 2.11.4-9.fc38 fedora 482 k
 coin-or-Osi x86_64 0.108.6-8.fc38 fedora 320 k
 copy-jdk-configs noarch 4.1-2.fc38 fedora 28 k
 cpp x86_64 13.2.1-7.fc38 updates 11 M
 cpuinfo x86_64 1:0-20240327.0.gitf42f5eaf.fc38 copr_base 46 k
 crypto-policies-scripts noarch 20230301-1.gita12f7b2.fc38 fedora 116 k
 cuda-cccl-12-3 x86_64 12.3.101-1 http_developer_download_nvidia_com_compute_cuda_repos_rhel8_x86_64 1.9 M
 cuda-crt-12-3 x86_64 12.3.107-1 http_developer_download_nvidia_com_compute_cuda_repos_rhel8_x86_64 111 k
 cuda-cudart-12-3 x86_64 12.3.101-1 http_developer_download_nvidia_com_compute_cuda_repos_rhel8_x86_64 223 k
 cuda-gcc-12 x86_64 12.3.1-1.fc38 copr_base 34 M
 cuda-nvrtc-12-3 x86_64 12.3.107-1 http_developer_download_nvidia_com_compute_cuda_repos_rhel8_x86_64 24 M
 cuda-nvvm-12-3 x86_64 12.3.107-1 http_developer_download_nvidia_com_compute_cuda_repos_rhel8_x86_64 26 M
 cuda-toolkit-12-3-config-common noarch 12.3.101-1 http_developer_download_nvidia_com_compute_cuda_repos_rhel8_x86_64 7.7 k
 cuda-toolkit-12-config-common noarch 12.4.127-1 http_developer_download_nvidia_com_compute_cuda_repos_rhel8_x86_64 7.9 k
 cuda-toolkit-config-common noarch 12.4.127-1 http_developer_download_nvidia_com_compute_cuda_repos_rhel8_x86_64 7.9 k
 cups-libs x86_64 1:2.4.7-11.fc38 updates 268 k
 cutlass x86_64 3.4.1-20240215.0.cu12_3.fc38 copr_base 181 M
 dbus x86_64 1:1.14.10-1.fc38 updates 8.0 k
 dbus-broker x86_64 33-1.fc38 fedora 173 k
 dbus-common noarch 1:1.14.10-1.fc38 updates 15 k
 dbus-libs x86_64 1:1.14.10-1.fc38 updates 156 k
 double-conversion x86_64 3.1.5-8.fc38 fedora 49 k
 emacs-filesystem noarch 1:29.3-1.fc38 updates 7.2 k
 expat x86_64 2.6.0-1.fc38 updates 112 k
 fbgemm x86_64 0.7.0-20240315.0.git0049a2ca.fc38 copr_base 1.4 M
 fdk-aac-free x86_64 2.0.0-10.fc38 fedora 336 k
 fftw x86_64 3.3.10-10.fc38 updates 46 k
 fftw-libs x86_64 3.3.10-10.fc38 updates 8.0 k
 fftw-libs-double x86_64 3.3.10-10.fc38 updates 1.2 M
 fftw-libs-long x86_64 3.3.10-10.fc38 updates 505 k
 fftw-libs-quad x86_64 3.3.10-10.fc38 updates 740 k
 fftw-libs-single x86_64 3.3.10-10.fc38 updates 1.2 M
 flatbuffers x86_64 23.3.3-1.fc38 fedora 197 k
 flexiblas x86_64 3.4.2-1.fc38 updates 25 k
 flexiblas-netlib x86_64 3.4.2-1.fc38 updates 3.1 M
 flexiblas-netlib64 x86_64 3.4.2-1.fc38 updates 3.0 M
 flexiblas-openblas-openmp x86_64 3.4.2-1.fc38 updates 17 k
 flexiblas-openblas-openmp64 x86_64 3.4.2-1.fc38 updates 17 k
 fontconfig x86_64 2.14.2-2.fc38 updates 295 k
 fonts-filesystem noarch 1:2.0.5-11.fc38 fedora 8.1 k
 foxi x86_64 0-20210526.1.gitc278588e.fc37 copr_base 12 k
 fp16 x86_64 1:0-20240410.0.git581ac1c7.fc38 copr_base 12 k
 freetype x86_64 2.13.0-2.fc38 fedora 414 k
 freexl x86_64 1.0.6-21.fc38 fedora 35 k
 fribidi x86_64 1.0.12-3.fc38 fedora 89 k
 game-music-emu x86_64 0.6.3-11.fc38 fedora 157 k
 gc x86_64 8.2.2-3.fc38 fedora 110 k
 gcc x86_64 13.2.1-7.fc38 updates 34 M
 gcc-plugin-annobin x86_64 13.2.1-7.fc38 updates 52 k
 gd x86_64 2.3.3-10.fc38 fedora 140 k
 gdal-libs x86_64 3.6.4-2.fc38 updates 8.2 M
 gdk-pixbuf2 x86_64 2.42.10-2.fc38 fedora 485 k
 gdk-pixbuf2-modules x86_64 2.42.10-2.fc38 fedora 85 k
 gecode x86_64 6.2.0-11.fc38 fedora 3.2 M
 geos x86_64 3.11.1-3.fc38 fedora 994 k
 gflags x86_64 2.2.2-11.fc38 fedora 93 k
 giflib x86_64 5.2.2-1.fc38 updates 52 k
 git-core x86_64 2.44.0-1.fc38 updates 4.5 M
 git-core-doc noarch 2.44.0-1.fc38 updates 2.9 M
 gklib x86_64 5.1.1-20230326.0.git8bd6bad7.fc38 copr_base 103 k
 gl-manpages noarch 1.1-26.20190306.fc38 fedora 1.2 M
 glib2 x86_64 2.76.6-1.fc38 updates 2.8 M
 glibc-devel x86_64 2.37-18.fc38 updates 54 k
 glibc-headers-x86 noarch 2.37-18.fc38 updates 536 k
 glog x86_64 0.3.5-17.fc38 fedora 68 k
 gloo x86_64 1:0.5.0-20240302.0.git2565674c.cu12_3.fc38 copr_base 825 k
 glpk x86_64 5.0-6.fc38 fedora 387 k
 glx-utils x86_64 8.5.0-1.fc38 fedora 40 k
 gmp-c++ x86_64 1:6.2.1-4.fc38 fedora 18 k
 gnutls x86_64 3.8.4-1.fc38 updates 1.1 M
 google-droid-sans-fonts noarch 20200215-15.fc38 updates 2.7 M
 google-noto-fonts-common noarch 20230201-2.fc38 updates 16 k
 google-noto-sans-vf-fonts noarch 20230201-2.fc38 updates 579 k
 graphene x86_64 1.10.6-5.fc38 fedora 62 k
 graphite2 x86_64 1.3.14-11.fc38 fedora 95 k
 graphviz x86_64 7.1.0-3.fc38 updates 5.0 M
 groff-base x86_64 1.22.4-11.fc38 fedora 1.1 M
 gsl x86_64 2.7.1-4.fc38 fedora 1.1 M
 gsm x86_64 1.0.22-2.fc38 fedora 35 k
 gstreamer1 x86_64 1.22.9-1.fc38 updates 1.4 M
 gstreamer1-plugins-base x86_64 1.22.9-1.fc38 updates 2.2 M
 gts x86_64 0.7.6-44.20121130.fc38 fedora 240 k
 guile22 x86_64 2.2.7-7.fc38 fedora 6.5 M
 halide x86_64 17.0.1-20240220.0.fc38 copr_base 20 M
 harfbuzz x86_64 7.1.0-1.fc38 fedora 889 k
 hdf-libs x86_64 4.2.15-12.fc38 fedora 294 k
 hdf5 x86_64 1.12.1-11.fc38 fedora 2.2 M
 highway x86_64 1.1.0-1.fc38 updates 482 k
 hiredis x86_64 1.0.2-4.fc38 fedora 42 k
 hwdata noarch 0.380-1.fc38 updates 1.6 M
 ilbc x86_64 3.0.4-4.fc38 fedora 53 k
 imath x86_64 3.1.10-1.fc38 updates 97 k
 infiniband-diags x86_64 44.0-3.fc38 fedora 329 k
 isl x86_64 0.16.1-17.fc38 fedora 853 k
 iso-codes noarch 4.13.0-1.fc38 fedora 3.5 M
 jacop noarch 4.9.0-1.fc38 fedora 1.7 M
 java-17-openjdk-headless x86_64 1:17.0.9.0.9-3.fc38 updates 44 M
 javapackages-filesystem noarch 6.1.0-7.fc38 fedora 13 k
 javapackages-tools noarch 6.1.0-7.fc38 fedora 37 k
 jbig2dec-libs x86_64 0.19-8.fc38 fedora 73 k
 jbigkit-libs x86_64 2.1-25.fc38 fedora 53 k
 json-c x86_64 0.17-1.fc38 updates 43 k
 jsoncpp x86_64 1.9.5-4.fc38 fedora 97 k
 kernel-headers x86_64 6.8.3-100.fc38 updates 1.6 M
 keyutils-libs-devel x86_64 1.6.3-1.fc38 updates 60 k
 kineto x86_64 0.4.0-20240327.0.git445909a8.cu12_3.fc38 copr_base 296 k
 kmod-libs x86_64 30-4.fc38 fedora 68 k
 krb5-devel x86_64 1.21-3.fc38 updates 144 k
 lame-libs x86_64 3.100-14.fc38 fedora 337 k
 langpacks-core-font-en noarch 3.0-32.fc38 updates 9.6 k
 lasi x86_64 1.1.3-10.fc38 fedora 53 k
 lcms2 x86_64 2.15-1.fc38 fedora 178 k
 less x86_64 633-1.fc38 updates 175 k
 leveldb x86_64 1.23-6.fc38 fedora 151 k
 libGLEW x86_64 2.2.0-4.fc38 fedora 175 k
 libICE x86_64 1.0.10-10.fc38 fedora 71 k
 libSM x86_64 1.2.3-12.fc38 fedora 41 k
 libX11 x86_64 1.8.7-1.fc38 updates 650 k
 libX11-common noarch 1.8.7-1.fc38 updates 176 k
 libX11-devel x86_64 1.8.7-1.fc38 updates 1.0 M
 libX11-xcb x86_64 1.8.7-1.fc38 updates 12 k
 libXau x86_64 1.0.11-2.fc38 fedora 32 k
 libXau-devel x86_64 1.0.11-2.fc38 fedora 14 k
 libXcursor x86_64 1.2.1-3.fc38 fedora 30 k
 libXext x86_64 1.3.5-2.fc38 fedora 39 k
 libXfixes x86_64 6.0.0-5.fc38 fedora 19 k
 libXft x86_64 2.3.8-2.fc38 updates 72 k
 libXi x86_64 1.8.1-1.fc38 updates 40 k
 libXpm x86_64 3.5.17-1.fc38 updates 65 k
 libXrender x86_64 0.9.11-2.fc38 fedora 27 k
 libXt x86_64 1.2.1-4.fc38 fedora 179 k
 libXv x86_64 1.0.11-18.fc38 fedora 18 k
 libXxf86vm x86_64 1.1.5-2.fc38 fedora 18 k
 libaec x86_64 1.0.6-4.fc38 fedora 42 k
 libaom x86_64 3.8.2-1.fc38 updates 1.8 M
 libavcodec-free x86_64 6.0.1-2.fc38 updates 4.0 M
 libavformat-free x86_64 6.0.1-2.fc38 updates 1.1 M
 libavif x86_64 0.11.1-7.fc38 fedora 84 k
 libavutil-free x86_64 6.0.1-2.fc38 updates 343 k
 libb2 x86_64 0.98.1-8.fc38 fedora 25 k
 libbluray x86_64 1.3.4-2.fc38 fedora 173 k
 libcbor x86_64 0.7.0-9.fc38 fedora 56 k
 libchromaprint x86_64 1.5.1-8.fc38 fedora 39 k
 libcom_err-devel x86_64 1.46.5-4.fc38 fedora 15 k
 libcublas-12-3 x86_64 12.3.4.1-2 copr_rezso_CUDA 245 M
 libcudnn8 x86_64 8.9.7.29-2.cuda12.3 copr_rezso_CUDA 447 M
 libcufft-12-3 x86_64 11.0.12.1-2 copr_rezso_CUDA 60 M
 libcurand-12-3 x86_64 10.3.4.107-1 http_developer_download_nvidia_com_compute_cuda_repos_rhel8_x86_64 53 M
 libcusolver-12-3 x86_64 11.5.4.101-2 copr_rezso_CUDA 77 M
 libcusparse-12-3 x86_64 12.2.0.103-2 copr_rezso_CUDA 108 M
 libdatrie x86_64 0.2.13-5.fc38 fedora 32 k
 libdav1d x86_64 1.2.1-1.fc38 updates 618 k
 libdc1394 x86_64 2.2.6-9.fc38 fedora 130 k
 libdrm x86_64 2.4.120-1.fc38 updates 157 k
 libedit x86_64 3.1-45.20221030cvs.fc38 fedora 107 k
 libevdev x86_64 1.13.1-1.fc38 updates 44 k
 libfido2 x86_64 1.12.0-3.fc38 fedora 97 k
 libfontenc x86_64 1.1.6-2.fc38 fedora 32 k
 libgcrypt x86_64 1.10.2-1.fc38 updates 514 k
 libgeotiff x86_64 1.7.1-6.fc38 fedora 106 k
 libgfortran x86_64 13.2.1-7.fc38 updates 910 k
 libglvnd x86_64 1:1.6.0-2.fc38 fedora 134 k
 libglvnd-core-devel x86_64 1:1.6.0-2.fc38 fedora 18 k
 libglvnd-devel x86_64 1:1.6.0-2.fc38 fedora 163 k
 libglvnd-egl x86_64 1:1.6.0-2.fc38 fedora 36 k
 libglvnd-gles x86_64 1:1.6.0-2.fc38 fedora 32 k
 libglvnd-glx x86_64 1:1.6.0-2.fc38 fedora 142 k
 libglvnd-opengl x86_64 1:1.6.0-2.fc38 fedora 43 k
 libgpg-error x86_64 1.47-1.fc38 updates 230 k
 libgs x86_64 10.02.1-2.fc38 updates 3.4 M
 libgta x86_64 1.2.1-9.fc38 fedora 35 k
 libgudev x86_64 237-4.fc38 fedora 35 k
 libharu x86_64 2.4.3-2.fc38 fedora 580 k
 libibumad x86_64 44.0-3.fc38 fedora 27 k
 libibverbs x86_64 44.0-3.fc38 fedora 429 k
 libicu x86_64 72.1-2.fc38 fedora 10 M
 libijs x86_64 0.35-17.fc38 fedora 29 k
 libimagequant x86_64 2.17.0-4.fc38 fedora 63 k
 libinput x86_64 1.23.0-2.fc38 updates 213 k
 libjpeg-turbo x86_64 2.1.4-2.fc38 fedora 183 k
 libjxl x86_64 1:0.7.0-6.fc38 fedora 1.0 M
 libkadm5 x86_64 1.21-3.fc38 updates 78 k
 libkml x86_64 1.3.0-43.fc38 fedora 355 k
 libldb x86_64 2.7.2-1.fc38 fedora 180 k
 liblerc x86_64 4.0.0-3.fc38 fedora 202 k
 libmodplug x86_64 1:0.8.9.0-16.fc38 fedora 176 k
 libmpc x86_64 1.3.1-2.fc38 fedora 70 k
 libnauty x86_64 2.8.6-5.fc38 updates 603 k
 libnccl x86_64 2.21.5-1+cuda12.4 http_developer_download_nvidia_com_compute_cuda_repos_rhel8_x86_64 130 M
 libnl3 x86_64 3.7.0-3.fc38 fedora 345 k
 libnpp-12-3 x86_64 12.2.3.2-2 copr_rezso_CUDA 96 M
 libnvjitlink-12-3 x86_64 12.3.101-1 http_developer_download_nvidia_com_compute_cuda_repos_rhel8_x86_64 19 M
 libogg x86_64 2:1.3.5-5.fc38 fedora 33 k
 libopenmpt x86_64 0.6.12-1.fc38 updates 637 k
 libpaper x86_64 1:2.0.8-1.fc38 fedora 26 k
 libpciaccess x86_64 0.16-8.fc38 fedora 26 k
 libpng x86_64 2:1.6.37-14.fc38 fedora 120 k
 libpq x86_64 15.3-1.fc38 updates 215 k
 libproxy x86_64 0.4.18-6.fc38 fedora 71 k
 libqhull_r x86_64 1:7.2.1-12.fc38 fedora 167 k
 libquadmath x86_64 13.2.1-7.fc38 updates 200 k
 librabbitmq x86_64 0.13.0-1.fc38 fedora 43 k
 libraw1394 x86_64 2.1.2-17.fc38 fedora 64 k
 librdmacm x86_64 44.0-3.fc38 fedora 72 k
 librist x86_64 0.2.7-1.fc38 fedora 77 k
 librsvg2 x86_64 2.56.4-1.fc38 updates 1.6 M
 librttopo x86_64 1.1.0-11.fc38 fedora 208 k
 libseccomp x86_64 2.5.3-4.fc38 fedora 71 k
 libselinux-devel x86_64 3.5-1.fc38 fedora 151 k
 libsepol-devel x86_64 3.5-1.fc38 fedora 49 k
 libsmbclient x86_64 2:4.18.11-1.fc38 updates 76 k
 libsodium x86_64 1.0.18-11.fc38 fedora 162 k
 libsodium-devel x86_64 1.0.18-11.fc38 fedora 1.1 M
 libspatialite x86_64 5.0.1-20.fc38 fedora 3.1 M
 libstdc++-devel x86_64 13.2.1-7.fc38 updates 2.6 M
 libswresample-free x86_64 6.0.1-2.fc38 updates 69 k
 libswscale-free x86_64 6.0.1-2.fc38 updates 191 k
 libtalloc x86_64 2.4.0-2.fc38 fedora 31 k
 libtdb x86_64 1.4.8-1.fc38 fedora 51 k
 libtevent x86_64 0.14.1-1.fc38 fedora 45 k
 libthai x86_64 0.1.29-4.fc38 fedora 213 k
 libtheora x86_64 1:1.1.1-33.fc38 fedora 166 k
 libtiff x86_64 4.4.0-8.fc38 updates 201 k
 libtool-ltdl x86_64 2.4.7-6.fc38 fedora 37 k
 libudfread x86_64 1.1.2-5.fc38 fedora 35 k
 libunwind x86_64 1.6.2-7.fc38 fedora 68 k
 libunwind-devel x86_64 1.6.2-7.fc38 fedora 88 k
 liburing x86_64 2.4-2.fc38 updates 38 k
 libusb1 x86_64 1.0.27-1.fc38 updates 76 k
 libuv x86_64 1:1.48.0-1.fc38 updates 252 k
 libuv-static x86_64 1:1.48.0-1.fc38 updates 106 k
 libva x86_64 2.18.0-1.fc38 fedora 105 k
 libvdpau x86_64 1.5-3.fc38 fedora 16 k
 libverto-devel x86_64 0.3.2-5.fc38 fedora 14 k
 libvisual x86_64 1:0.4.1-1.fc38 fedora 151 k
 libvmaf x86_64 2.3.0-5.fc38 fedora 180 k
 libvorbis x86_64 1:1.3.7-7.fc38 fedora 195 k
 libvpl x86_64 1:2.10.2-1.fc38 updates 174 k
 libvpx x86_64 1.13.1-1.fc38 updates 1.1 M
 libwacom x86_64 2.8.0-1.fc38 updates 42 k
 libwacom-data noarch 2.8.0-1.fc38 updates 191 k
 libwayland-client x86_64 1.22.0-1.fc38 updates 34 k
 libwayland-cursor x86_64 1.22.0-1.fc38 updates 19 k
 libwayland-egl x86_64 1.22.0-1.fc38 updates 13 k
 libwayland-server x86_64 1.22.0-1.fc38 updates 42 k
 libwbclient x86_64 2:4.18.11-1.fc38 updates 46 k
 libwebp x86_64 1.3.2-2.fc38 updates 284 k
 libxcb x86_64 1.13.1-11.fc38 fedora 231 k
 libxcb-devel x86_64 1.13.1-11.fc38 fedora 1.4 M
 libxcrypt-devel x86_64 4.4.36-1.fc38 updates 30 k
 libxkbcommon x86_64 1.5.0-2.fc38 fedora 140 k
 libxkbcommon-x11 x86_64 1.5.0-2.fc38 fedora 22 k
 libxshmfence x86_64 1.3-12.fc38 fedora 12 k
 libyaml x86_64 0.2.5-9.fc38 fedora 59 k
 lksctp-tools x86_64 1.0.19-3.fc38 fedora 92 k
 llvm-libs x86_64 16.0.6-3.fc38 updates 27 M
 llvm15-libs x86_64 15.0.7-4.fc38 fedora 25 M
 lmdb x86_64 0.9.32-1.fc38 updates 32 k
 lmdb-libs x86_64 0.9.32-1.fc38 updates 61 k
 lpcnetfreedv x86_64 0.2-13.fc38 fedora 7.3 M
 lua x86_64 5.4.4-9.fc38 fedora 190 k
 lua-filesystem x86_64 1.8.0-8.fc38 fedora 34 k
 lua-json noarch 1.3.4-3.fc38 fedora 30 k
 lua-lpeg x86_64 1.0.2-10.fc38 fedora 67 k
 lua-posix x86_64 35.1-5.fc38 fedora 138 k
 lua-term x86_64 0.07-17.fc38 fedora 15 k
 magma x86_64 2.8.0-20240328.0.cu12_3.fc38 copr_base 119 M
 make x86_64 1:4.4.1-1.fc38 updates 588 k
 mariadb-connector-c x86_64 3.3.8-1.fc38 updates 214 k
 mariadb-connector-c-config noarch 3.3.8-1.fc38 updates 8.6 k
 mbedtls x86_64 2.28.7-1.fc38 updates 402 k
 mesa-filesystem x86_64 23.1.9-1.fc38 updates 17 k
 mesa-libEGL x86_64 23.1.9-1.fc38 updates 131 k
 mesa-libGL x86_64 23.1.9-1.fc38 updates 173 k
 mesa-libGLU x86_64 9.0.3-1.fc38 updates 160 k
 mesa-libgbm x86_64 23.1.9-1.fc38 updates 44 k
 mesa-libglapi x86_64 23.1.9-1.fc38 updates 53 k
 metis x86_64 5.2.1-20230403.0.gite0f1b88b.fc38 copr_base 176 k
 miniz x86_64 3.0.2-2.fc38 fedora 65 k
 minizip-ng x86_64 3.0.7-4.fc38 updates 69 k
 mkfontscale x86_64 1.2.2-3.fc38 fedora 32 k
 mockito noarch 3.12.4-6.fc38 fedora 583 k
 mp x86_64 3.1.0-41.20200303git7fd4828.fc38 fedora 978 k
 mpdecimal x86_64 2.5.1-6.fc38 fedora 89 k
 mpg123-libs x86_64 1.31.3-1.fc38 fedora 340 k
 mtdev x86_64 1.1.6-5.fc38 fedora 21 k
 ncurses x86_64 6.4-7.20230520.fc38.1 updates 415 k
 netcdf x86_64 4.9.0-5.fc38 fedora 833 k
 netpbm x86_64 11.02.00-1.fc38 fedora 185 k
 nettle x86_64 3.8-3.fc38 fedora 412 k
 nnpack x86_64 0-20230201.0.git70a77f48.fc38 copr_base 56 k
 nspr x86_64 4.35.0-17.fc38 updates 137 k
 nss x86_64 3.99.0-1.fc38 updates 703 k
 nss-softokn x86_64 3.99.0-1.fc38 updates 419 k
 nss-softokn-freebl x86_64 3.99.0-1.fc38 updates 381 k
 nss-sysinit x86_64 3.99.0-1.fc38 updates 18 k
 nss-util x86_64 3.99.0-1.fc38 updates 87 k
 numactl-libs x86_64 2.0.16-2.fc38 fedora 31 k
 objectweb-asm noarch 9.3-5.fc38 fedora 355 k
 objenesis noarch 3.3-2.fc38 fedora 116 k
 ocl-icd x86_64 2.3.2-1.fc38 updates 66 k
 ogdi x86_64 4.1.0-10.fc38 fedora 244 k
 onnx-libs x86_64 1.17.0-20240404.0.git4128a090.fc38 copr_base 867 k
 onnx-optimizer x86_64 0.3.19-20240303.0.gitb3a46118.fc38 copr_base 199 k
 openblas x86_64 0.3.21-4.fc38 fedora 35 k
 openblas-openmp64 x86_64 0.3.21-4.fc38 fedora 4.9 M
 openblas-openmp64_ x86_64 0.3.21-4.fc38 fedora 4.9 M
 openblas-serial x86_64 0.3.21-4.fc38 fedora 4.9 M
 openblas-serial64 x86_64 0.3.21-4.fc38 fedora 4.8 M
 openblas-serial64_ x86_64 0.3.21-4.fc38 fedora 4.8 M
 openblas-threads x86_64 0.3.21-4.fc38 fedora 5.1 M
 openblas-threads64 x86_64 0.3.21-4.fc38 fedora 4.9 M
 openblas-threads64_ x86_64 0.3.21-4.fc38 fedora 4.9 M
 opencl-headers noarch 3.0-18.20231003git9ce9a72.fc38 updates 87 k
 opencore-amr x86_64 0.1.6-3.fc38 fedora 177 k
 opencv x86_64 4.9.0-20231227.1.cu12_3.fc38 copr_base 4.4 M
 opencv-contrib x86_64 4.9.0-20231227.1.cu12_3.fc38 copr_base 5.7 M
 opencv-core x86_64 4.9.0-20231227.1.cu12_3.fc38 copr_base 10 M
 opencv-cuda x86_64 4.9.0-20231227.1.cu12_3.fc38 copr_base 37 M
 opencv-static x86_64 4.9.0-20231227.1.cu12_3.fc38 copr_base 425 k
 openexr-libs x86_64 3.1.10-1.fc38 updates 1.1 M
 openjpeg2 x86_64 2.5.2-1.fc38 updates 178 k
 openpgm x86_64 5.2.122-31.fc38 fedora 176 k
 openpgm-devel x86_64 5.2.122-31.fc38 fedora 67 k
 openslide x86_64 3.4.1-23.fc38 fedora 106 k
 openssh x86_64 9.0p1-19.fc38 updates 435 k
 openssh-clients x86_64 9.0p1-19.fc38 updates 701 k
 opentest4j noarch 1.2.0-12.fc38 fedora 24 k
 opus x86_64 1.3.1-12.fc38 fedora 206 k
 orc x86_64 0.4.33-2.fc38 fedora 202 k
 pango x86_64 1.50.14-1.fc38 fedora 342 k
 pcre x86_64 8.45-1.fc38.3 fedora 201 k
 pcre2-devel x86_64 10.42-1.fc38.1 fedora 506 k
 pcre2-utf16 x86_64 10.42-1.fc38.1 fedora 214 k
 pcre2-utf32 x86_64 10.42-1.fc38.1 fedora 201 k
 perl-AutoLoader noarch 5.74-498.fc38 updates 22 k
 perl-B x86_64 1.83-498.fc38 updates 182 k
 perl-Carp noarch 1.52-490.fc38 fedora 29 k
 perl-Class-Struct noarch 0.66-498.fc38 updates 23 k
 perl-Data-Dumper x86_64 2.184-491.fc38 fedora 56 k
 perl-Digest noarch 1.20-490.fc38 fedora 25 k
 perl-Digest-MD5 x86_64 2.58-490.fc38 fedora 36 k
 perl-DynaLoader x86_64 1.52-498.fc38 updates 27 k
 perl-Encode x86_64 4:3.19-493.fc38 fedora 1.7 M
 perl-Errno x86_64 1.36-498.fc38 updates 15 k
 perl-Error noarch 1:0.17029-11.fc38 fedora 40 k
 perl-Exporter noarch 5.77-490.fc38 fedora 31 k
 perl-Fcntl x86_64 1.15-498.fc38 updates 21 k
 perl-File-Basename noarch 2.85-498.fc38 updates 18 k
 perl-File-Find noarch 1.40-498.fc38 updates 26 k
 perl-File-Path noarch 2.18-490.fc38 fedora 35 k
 perl-File-Temp noarch 1:0.231.100-490.fc38 fedora 59 k
 perl-File-stat noarch 1.12-498.fc38 updates 18 k
 perl-FileHandle noarch 2.03-498.fc38 updates 16 k
 perl-Getopt-Long noarch 1:2.54-2.fc38 fedora 60 k
 perl-Getopt-Std noarch 1.13-498.fc38 updates 16 k
 perl-Git noarch 2.44.0-1.fc38 updates 40 k
 perl-HTTP-Tiny noarch 0.086-2.fc38 updates 55 k
 perl-IO x86_64 1.50-498.fc38 updates 92 k
 perl-IO-Socket-IP noarch 0.41-492.fc38 fedora 41 k
 perl-IO-Socket-SSL noarch 2.081-1.fc38 fedora 227 k
 perl-IPC-Open3 noarch 1.22-498.fc38 updates 23 k
 perl-MIME-Base64 x86_64 3.16-490.fc38 fedora 30 k
 perl-Mozilla-CA noarch 20221114-2.fc38 fedora 12 k
 perl-Net-SSLeay x86_64 1.92-5.fc38 fedora 361 k
 perl-POSIX x86_64 2.03-498.fc38 updates 98 k
 perl-PathTools x86_64 3.84-490.fc38 fedora 87 k
 perl-Pod-Escapes noarch 1:1.07-490.fc38 fedora 20 k
 perl-Pod-Perldoc noarch 3.28.01-491.fc38 fedora 86 k
 perl-Pod-Simple noarch 1:3.43-491.fc38 fedora 219 k
 perl-Pod-Usage noarch 4:2.03-4.fc38 fedora 40 k
 perl-Scalar-List-Utils x86_64 5:1.63-490.fc38 fedora 72 k
 perl-SelectSaver noarch 1.02-498.fc38 updates 12 k
 perl-Socket x86_64 4:2.036-2.fc38 fedora 55 k
 perl-Storable x86_64 1:3.26-490.fc38 fedora 97 k
 perl-Symbol noarch 1.09-498.fc38 updates 15 k
 perl-Term-ANSIColor noarch 5.01-491.fc38 fedora 47 k
 perl-Term-Cap noarch 1.18-1.fc38 fedora 22 k
 perl-TermReadKey x86_64 2.38-16.fc38 fedora 35 k
 perl-Text-ParseWords noarch 3.31-490.fc38 fedora 16 k
 perl-Text-Tabs+Wrap noarch 2023.0511-1.fc38 updates 22 k
 perl-Time-Local noarch 2:1.300-490.fc38 fedora 33 k
 perl-URI noarch 5.17-2.fc38 fedora 120 k
 perl-base noarch 2.27-498.fc38 updates 17 k
 perl-constant noarch 1.33-491.fc38 fedora 23 k
 perl-if noarch 0.61.000-498.fc38 updates 15 k
 perl-interpreter x86_64 4:5.36.3-498.fc38 updates 73 k
 perl-lib x86_64 0.65-498.fc38 updates 15 k
 perl-libnet noarch 3.15-1.fc38 fedora 128 k
 perl-libs x86_64 4:5.36.3-498.fc38 updates 2.2 M
 perl-locale noarch 1.10-498.fc38 updates 14 k
 perl-mro x86_64 1.26-498.fc38 updates 29 k
 perl-overload noarch 1.35-498.fc38 updates 46 k
 perl-overloading noarch 0.02-498.fc38 updates 13 k
 perl-parent noarch 1:0.241-1.fc38 fedora 15 k
 perl-podlators noarch 1:5.01-2.fc38 fedora 125 k
 perl-vars noarch 1.05-498.fc38 updates 14 k
 pixman x86_64 0.42.2-1.fc38 fedora 285 k
 poppler x86_64 23.02.0-3.fc38 updates 1.2 M
 poppler-data noarch 0.4.11-4.fc38 fedora 2.0 M
 poppler-glib x86_64 23.02.0-3.fc38 updates 175 k
 procps-ng x86_64 3.3.17-11.fc38 updates 338 k
 proj x86_64 9.1.1-1.fc38 fedora 1.4 M
 proj-data noarch 9.1.1-1.fc38 fedora 1.2 M
 protobuf x86_64 3.19.6-2.fc38 fedora 1.0 M
 protobuf-compat x86_64 3.21.9-2.fc38 copr_base 1.1 M
 pthreadpool x86_64 1:0.1-20240121.0.git178e3e06.fc38 copr_base 44 k
 pugixml x86_64 1.13-2.fc38 fedora 100 k
 pyproject-rpm-macros noarch 1.12.0-1.fc38 updates 41 k
 python-pip-wheel noarch 22.3.1-3.fc38 updates 1.4 M
 python-rpm-macros noarch 3.11-10.fc38 fedora 20 k
 python-setuptools-wheel noarch 65.5.1-2.fc38 fedora 715 k
 python3 x86_64 3.11.8-2.fc38 updates 28 k
 python3-libs x86_64 3.11.8-2.fc38 updates 9.6 M
 python3-packaging noarch 23.0-1.fc38 fedora 106 k
 python3-rpm-generators noarch 14-4.fc38 updates 30 k
 python3-rpm-macros noarch 3.11-10.fc38 fedora 15 k
 qnnpack x86_64 0-20190828.2.git7d2a4e99.fc38 copr_base 50 k
 qt-settings noarch 38.3-1.fc38 updates 9.2 k
 qt5-qtbase x86_64 5.15.12-5.fc38 updates 3.6 M
 qt5-qtbase-common noarch 5.15.12-5.fc38 updates 12 k
 qt5-qtbase-gui x86_64 5.15.12-5.fc38 updates 6.4 M
 rav1e-libs x86_64 0.7.1-1.fc38 updates 1.0 M
 rhash x86_64 1.4.3-2.fc38 fedora 194 k
 rocksdb x86_64 7.8.3-1.fc38 fedora 2.8 M
 samba-client-libs x86_64 2:4.18.11-1.fc38 updates 5.3 M
 samba-common noarch 2:4.18.11-1.fc38 updates 150 k
 samba-common-libs x86_64 2:4.18.11-1.fc38 updates 108 k
 scotch x86_64 6.1.2-3.fc37 fedora 397 k
 shared-mime-info x86_64 2.2-3.fc38 fedora 381 k
 sleef x86_64 3.6-20240320.0.git60e76d2b.fc38 copr_base 903 k
 snappy x86_64 1.1.9-7.fc38 fedora 36 k
 soxr x86_64 0.1.3-13.fc38 fedora 84 k
 speex x86_64 1.2.0-13.fc38 fedora 67 k
 srt-libs x86_64 1.5.2-1.fc38 updates 370 k
 suitesparse x86_64 5.13.0-2.fc38 fedora 1.1 M
 svt-av1-libs x86_64 1.4.1-2.fc38 fedora 2.0 M
 systemd x86_64 253.17-1.fc38 updates 4.5 M
 systemd-pam x86_64 253.17-1.fc38 updates 337 k
 systemd-rpm-macros noarch 253.17-1.fc38 updates 25 k
 tbb x86_64 2020.3-16.fc38 fedora 169 k
 tcl x86_64 1:8.6.12-4.fc38 fedora 1.1 M
 tensorpipe x86_64 0-20220513.1.gitbb1473a4.fc37 copr_base 802 k
 twolame-libs x86_64 0.4.0-2.fc38 fedora 69 k
 tzdata-java noarch 2024a-1.fc38 updates 207 k
 unixODBC x86_64 2.3.11-2.fc38 fedora 483 k
 uriparser x86_64 0.9.7-2.fc38 fedora 60 k
 urw-base35-bookman-fonts noarch 20200910-16.fc38 fedora 848 k
 urw-base35-c059-fonts noarch 20200910-16.fc38 fedora 875 k
 urw-base35-d050000l-fonts noarch 20200910-16.fc38 fedora 76 k
 urw-base35-fonts noarch 20200910-16.fc38 fedora 11 k
 urw-base35-fonts-common noarch 20200910-16.fc38 fedora 21 k
 urw-base35-gothic-fonts noarch 20200910-16.fc38 fedora 643 k
 urw-base35-nimbus-mono-ps-fonts noarch 20200910-16.fc38 fedora 796 k
 urw-base35-nimbus-roman-fonts noarch 20200910-16.fc38 fedora 857 k
 urw-base35-nimbus-sans-fonts noarch 20200910-16.fc38 fedora 1.3 M
 urw-base35-p052-fonts noarch 20200910-16.fc38 fedora 974 k
 urw-base35-standard-symbols-ps-fonts noarch 20200910-16.fc38 fedora 42 k
 urw-base35-z003-fonts noarch 20200910-16.fc38 fedora 276 k
 vapoursynth-libs x86_64 58-4.fc38 fedora 544 k
 vim-filesystem noarch 2:9.1.264-1.fc38 updates 17 k
 vo-amrwbenc x86_64 0.1.3-18.fc38 fedora 80 k
 vtk x86_64 9.2.5-2.fc38 fedora 24 M
 xapian-core-libs x86_64 1.4.23-1.fc38 updates 771 k
 xcb-util x86_64 0.4.1-2.fc38 fedora 19 k
 xcb-util-image x86_64 0.4.1-2.fc38 fedora 19 k
 xcb-util-keysyms x86_64 0.4.1-2.fc38 fedora 14 k
 xcb-util-renderutil x86_64 0.3.10-2.fc38 fedora 17 k
 xcb-util-wm x86_64 0.4.2-2.fc38 fedora 31 k
 xerces-c x86_64 3.2.5-1.fc38 updates 969 k
 xkeyboard-config noarch 2.38-1.fc38 fedora 963 k
 xml-common noarch 0.6.3-60.fc38 fedora 31 k
 xorg-x11-fonts-ISO8859-1-100dpi noarch 7.5-35.fc38 fedora 1.1 M
 xorg-x11-proto-devel noarch 2022.2-3.fc38 fedora 299 k
 xvidcore x86_64 1.3.7-9.fc38 fedora 268 k
 zeromq x86_64 4.3.4-5.fc38 fedora 459 k
 zimg x86_64 3.0.5-1.fc38 updates 290 k
 zlib-devel x86_64 1.2.13-3.fc38 fedora 45 k
 zvbi x86_64 0.2.35-19.fc38 fedora 419 k

Transaction Summary
====================================================================================================================================================================
Install  590 Packages

Total size: 2.5 G
Total download size: 2.3 G
Installed size: 8.1 G
Downloading Packages:
[SKIPPED] protobuf-compat-3.21.9-2.fc38.x86_64.rpm: Already downloaded
[SKIPPED] protobuf-compat-compiler-3.21.9-2.fc38.x86_64.rpm: Already downloaded
[SKIPPED] protobuf-compat-devel-3.21.9-2.fc38.x86_64.rpm: Already downloaded
[SKIPPED] adobe-mappings-pdf-20190401-3.fc38.noarch.rpm: Already downloaded
[SKIPPED] crypto-policies-scripts-20230301-1.gita12f7b2.fc38.noarch.rpm: Already downloaded
[SKIPPED] doxygen-1.9.6-7.fc38.x86_64.rpm: Already downloaded
[SKIPPED] fonts-filesystem-2.0.5-11.fc38.noarch.rpm: Already downloaded
[SKIPPED] freetype-2.13.0-2.fc38.x86_64.rpm: Already downloaded
[SKIPPED] fribidi-1.0.12-3.fc38.x86_64.rpm: Already downloaded
[SKIPPED] gc-8.2.2-3.fc38.x86_64.rpm: Already downloaded
[SKIPPED] gd-2.3.3-10.fc38.x86_64.rpm: Already downloaded
[SKIPPED] gdk-pixbuf2-2.42.10-2.fc38.x86_64.rpm: Already downloaded
[SKIPPED] graphite2-1.3.14-11.fc38.x86_64.rpm: Already downloaded
[SKIPPED] groff-base-1.22.4-11.fc38.x86_64.rpm: Already downloaded
[SKIPPED] gts-0.7.6-44.20121130.fc38.x86_64.rpm: Already downloaded
[SKIPPED] guile22-2.2.7-7.fc38.x86_64.rpm: Already downloaded
[SKIPPED] harfbuzz-7.1.0-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] jbig2dec-libs-0.19-8.fc38.x86_64.rpm: Already downloaded
[SKIPPED] jbigkit-libs-2.1-25.fc38.x86_64.rpm: Already downloaded
[SKIPPED] jsoncpp-1.9.5-4.fc38.x86_64.rpm: Already downloaded
[SKIPPED] lasi-1.1.3-10.fc38.x86_64.rpm: Already downloaded
[SKIPPED] lcms2-2.15-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libICE-1.0.10-10.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libSM-1.2.3-12.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libXau-1.0.11-2.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libXext-1.3.5-2.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libXrender-0.9.11-2.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libXt-1.2.1-4.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libavif-0.11.1-7.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libb2-0.98.1-8.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libcbor-0.7.0-9.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libdatrie-0.2.13-5.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libedit-3.1-45.20221030cvs.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libfido2-1.12.0-3.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libfontenc-1.1.6-2.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libijs-0.35-17.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libimagequant-2.17.0-4.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libjpeg-turbo-2.1.4-2.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libjxl-0.7.0-6.fc38.x86_64.rpm: Already downloaded
[SKIPPED] liblerc-4.0.0-3.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libmpc-1.3.1-2.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libpaper-2.0.8-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libpng-1.6.37-14.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libthai-0.1.29-4.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libtool-ltdl-2.4.7-6.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libvmaf-2.3.0-5.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libxcb-1.13.1-11.fc38.x86_64.rpm: Already downloaded
[SKIPPED] llvm15-libs-15.0.7-4.fc38.x86_64.rpm: Already downloaded
[SKIPPED] mkfontscale-1.2.2-3.fc38.x86_64.rpm: Already downloaded
[SKIPPED] mpdecimal-2.5.1-6.fc38.x86_64.rpm: Already downloaded
[SKIPPED] netpbm-11.02.00-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] nettle-3.8-3.fc38.x86_64.rpm: Already downloaded
[SKIPPED] pango-1.50.14-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] perl-Carp-1.52-490.fc38.noarch.rpm: Already downloaded
[SKIPPED] perl-Data-Dumper-2.184-491.fc38.x86_64.rpm: Already downloaded
[SKIPPED] perl-Digest-1.20-490.fc38.noarch.rpm: Already downloaded
[SKIPPED] perl-Digest-MD5-2.58-490.fc38.x86_64.rpm: Already downloaded
[SKIPPED] perl-Encode-3.19-493.fc38.x86_64.rpm: Already downloaded
[SKIPPED] perl-Error-0.17029-11.fc38.noarch.rpm: Already downloaded
[SKIPPED] perl-Exporter-5.77-490.fc38.noarch.rpm: Already downloaded
[SKIPPED] perl-File-Path-2.18-490.fc38.noarch.rpm: Already downloaded
[SKIPPED] perl-File-Temp-0.231.100-490.fc38.noarch.rpm: Already downloaded
[SKIPPED] perl-Getopt-Long-2.54-2.fc38.noarch.rpm: Already downloaded
[SKIPPED] perl-IO-Socket-IP-0.41-492.fc38.noarch.rpm: Already downloaded
[SKIPPED] perl-IO-Socket-SSL-2.081-1.fc38.noarch.rpm: Already downloaded
[SKIPPED] perl-MIME-Base64-3.16-490.fc38.x86_64.rpm: Already downloaded
[SKIPPED] perl-Mozilla-CA-20221114-2.fc38.noarch.rpm: Already downloaded
[SKIPPED] perl-Net-SSLeay-1.92-5.fc38.x86_64.rpm: Already downloaded
[SKIPPED] perl-PathTools-3.84-490.fc38.x86_64.rpm: Already downloaded
[SKIPPED] perl-Pod-Escapes-1.07-490.fc38.noarch.rpm: Already downloaded
[SKIPPED] perl-Pod-Perldoc-3.28.01-491.fc38.noarch.rpm: Already downloaded
[SKIPPED] perl-Pod-Simple-3.43-491.fc38.noarch.rpm: Already downloaded
[SKIPPED] perl-Pod-Usage-2.03-4.fc38.noarch.rpm: Already downloaded
[SKIPPED] perl-Scalar-List-Utils-1.63-490.fc38.x86_64.rpm: Already downloaded
[SKIPPED] perl-Socket-2.036-2.fc38.x86_64.rpm: Already downloaded
[SKIPPED] perl-Storable-3.26-490.fc38.x86_64.rpm: Already downloaded
[SKIPPED] perl-Term-ANSIColor-5.01-491.fc38.noarch.rpm: Already downloaded
[SKIPPED] perl-Term-Cap-1.18-1.fc38.noarch.rpm: Already downloaded
[SKIPPED] perl-TermReadKey-2.38-16.fc38.x86_64.rpm: Already downloaded
[SKIPPED] perl-Text-ParseWords-3.31-490.fc38.noarch.rpm: Already downloaded
[SKIPPED] perl-Time-Local-1.300-490.fc38.noarch.rpm: Already downloaded
[SKIPPED] perl-URI-5.17-2.fc38.noarch.rpm: Already downloaded
[SKIPPED] perl-constant-1.33-491.fc38.noarch.rpm: Already downloaded
[SKIPPED] perl-libnet-3.15-1.fc38.noarch.rpm: Already downloaded
[SKIPPED] perl-parent-0.241-1.fc38.noarch.rpm: Already downloaded
[SKIPPED] perl-podlators-5.01-2.fc38.noarch.rpm: Already downloaded
[SKIPPED] pixman-0.42.2-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] poppler-data-0.4.11-4.fc38.noarch.rpm: Already downloaded
[SKIPPED] pybind11-devel-2.10.3-2.fc38.x86_64.rpm: Already downloaded
[SKIPPED] python-rpm-macros-3.11-10.fc38.noarch.rpm: Already downloaded
[SKIPPED] python-setuptools-wheel-65.5.1-2.fc38.noarch.rpm: Already downloaded
[SKIPPED] python3-packaging-23.0-1.fc38.noarch.rpm: Already downloaded
[SKIPPED] python3-rpm-macros-3.11-10.fc38.noarch.rpm: Already downloaded
[SKIPPED] python3-setuptools-65.5.1-2.fc38.noarch.rpm: Already downloaded
[SKIPPED] python3-six-1.16.0-9.fc38.noarch.rpm: Already downloaded
[SKIPPED] rhash-1.4.3-2.fc38.x86_64.rpm: Already downloaded
[SKIPPED] shared-mime-info-2.2-3.fc38.x86_64.rpm: Already downloaded
[SKIPPED] svt-av1-libs-1.4.1-2.fc38.x86_64.rpm: Already downloaded
[SKIPPED] urw-base35-bookman-fonts-20200910-16.fc38.noarch.rpm: Already downloaded
[SKIPPED] urw-base35-c059-fonts-20200910-16.fc38.noarch.rpm: Already downloaded
[SKIPPED] urw-base35-d050000l-fonts-20200910-16.fc38.noarch.rpm: Already downloaded
[SKIPPED] urw-base35-fonts-20200910-16.fc38.noarch.rpm: Already downloaded
[SKIPPED] urw-base35-fonts-common-20200910-16.fc38.noarch.rpm: Already downloaded
[SKIPPED] urw-base35-gothic-fonts-20200910-16.fc38.noarch.rpm: Already downloaded
[SKIPPED] urw-base35-nimbus-mono-ps-fonts-20200910-16.fc38.noarch.rpm: Already downloaded
[SKIPPED] urw-base35-nimbus-roman-fonts-20200910-16.fc38.noarch.rpm: Already downloaded
[SKIPPED] urw-base35-nimbus-sans-fonts-20200910-16.fc38.noarch.rpm: Already downloaded
[SKIPPED] urw-base35-p052-fonts-20200910-16.fc38.noarch.rpm: Already downloaded
[SKIPPED] urw-base35-standard-symbols-ps-fonts-20200910-16.fc38.noarch.rpm: Already downloaded
[SKIPPED] urw-base35-z003-fonts-20200910-16.fc38.noarch.rpm: Already downloaded
[SKIPPED] xml-common-0.6.3-60.fc38.noarch.rpm: Already downloaded
[SKIPPED] xorg-x11-fonts-ISO8859-1-100dpi-7.5-35.fc38.noarch.rpm: Already downloaded
[SKIPPED] zlib-devel-1.2.13-3.fc38.x86_64.rpm: Already downloaded
[SKIPPED] adobe-mappings-cmap-20230622-1.fc38.noarch.rpm: Already downloaded
[SKIPPED] adobe-mappings-cmap-deprecated-20230622-1.fc38.noarch.rpm: Already downloaded
[SKIPPED] annobin-docs-12.40-1.fc38.noarch.rpm: Already downloaded
[SKIPPED] annobin-plugin-gcc-12.40-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] avahi-libs-0.8-22.fc38.x86_64.rpm: Already downloaded
[SKIPPED] cairo-1.17.8-4.fc38.x86_64.rpm: Already downloaded
[SKIPPED] cairo-gobject-1.17.8-4.fc38.x86_64.rpm: Already downloaded
[SKIPPED] clang15-libs-15.0.7-5.fc38.x86_64.rpm: Already downloaded
[SKIPPED] clang15-resource-filesystem-15.0.7-5.fc38.x86_64.rpm: Already downloaded
[SKIPPED] cmake-3.27.7-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] cmake-data-3.27.7-1.fc38.noarch.rpm: Already downloaded
[SKIPPED] cmake-filesystem-3.27.7-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] cmake-rpm-macros-3.27.7-1.fc38.noarch.rpm: Already downloaded
[SKIPPED] cpp-13.2.1-7.fc38.x86_64.rpm: Already downloaded
[SKIPPED] cups-libs-2.4.7-11.fc38.x86_64.rpm: Already downloaded
[SKIPPED] dbus-libs-1.14.10-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] emacs-filesystem-29.3-1.fc38.noarch.rpm: Already downloaded
[SKIPPED] expat-2.6.0-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] fontconfig-2.14.2-2.fc38.x86_64.rpm: Already downloaded
[SKIPPED] gcc-13.2.1-7.fc38.x86_64.rpm: Already downloaded
[SKIPPED] gcc-c++-13.2.1-7.fc38.x86_64.rpm: Already downloaded
[SKIPPED] gcc-plugin-annobin-13.2.1-7.fc38.x86_64.rpm: Already downloaded
[SKIPPED] git-2.44.0-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] git-core-2.44.0-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] git-core-doc-2.44.0-1.fc38.noarch.rpm: Already downloaded
[SKIPPED] glib2-2.76.6-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] glibc-devel-2.37-18.fc38.x86_64.rpm: Already downloaded
[SKIPPED] glibc-headers-x86-2.37-18.fc38.noarch.rpm: Already downloaded
[SKIPPED] gnutls-3.8.4-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] google-droid-sans-fonts-20200215-15.fc38.noarch.rpm: Already downloaded
[SKIPPED] google-noto-fonts-common-20230201-2.fc38.noarch.rpm: Already downloaded
[SKIPPED] google-noto-sans-vf-fonts-20230201-2.fc38.noarch.rpm: Already downloaded
[SKIPPED] graphviz-7.1.0-3.fc38.x86_64.rpm: Already downloaded
[SKIPPED] highway-1.1.0-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] kernel-headers-6.8.3-100.fc38.x86_64.rpm: Already downloaded
[SKIPPED] langpacks-core-font-en-3.0-32.fc38.noarch.rpm: Already downloaded
[SKIPPED] less-633-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libX11-1.8.7-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libX11-common-1.8.7-1.fc38.noarch.rpm: Already downloaded
[SKIPPED] libXft-2.3.8-2.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libXpm-3.5.17-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libaom-3.8.2-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libdav1d-1.2.1-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libgs-10.02.1-2.fc38.x86_64.rpm: Already downloaded
[SKIPPED] librsvg2-2.56.4-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libstdc++-devel-13.2.1-7.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libtiff-4.4.0-8.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libuv-1.48.0-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libwebp-1.3.2-2.fc38.x86_64.rpm: Already downloaded
[SKIPPED] libxcrypt-devel-4.4.36-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] make-4.4.1-1.fc38.x86_64.rpm: Already downloaded
[SKIPPED] ncurses-6.4-7.20230520.fc38.1.x86_64.rpm: Already downloaded
[SKIPPED]
nspr-4.35.0-17.fc38.x86_64.rpm: Already downloaded [SKIPPED] nss-3.99.0-1.fc38.x86_64.rpm: Already downloaded [SKIPPED] nss-softokn-3.99.0-1.fc38.x86_64.rpm: Already downloaded [SKIPPED] nss-softokn-freebl-3.99.0-1.fc38.x86_64.rpm: Already downloaded [SKIPPED] nss-sysinit-3.99.0-1.fc38.x86_64.rpm: Already downloaded [SKIPPED] nss-util-3.99.0-1.fc38.x86_64.rpm: Already downloaded [SKIPPED] openjpeg2-2.5.2-1.fc38.x86_64.rpm: Already downloaded [SKIPPED] openssh-9.0p1-19.fc38.x86_64.rpm: Already downloaded [SKIPPED] openssh-clients-9.0p1-19.fc38.x86_64.rpm: Already downloaded [SKIPPED] perl-AutoLoader-5.74-498.fc38.noarch.rpm: Already downloaded [SKIPPED] perl-B-1.83-498.fc38.x86_64.rpm: Already downloaded [SKIPPED] perl-Class-Struct-0.66-498.fc38.noarch.rpm: Already downloaded [SKIPPED] perl-DynaLoader-1.52-498.fc38.x86_64.rpm: Already downloaded [SKIPPED] perl-Errno-1.36-498.fc38.x86_64.rpm: Already downloaded [SKIPPED] perl-Fcntl-1.15-498.fc38.x86_64.rpm: Already downloaded [SKIPPED] perl-File-Basename-2.85-498.fc38.noarch.rpm: Already downloaded [SKIPPED] perl-File-Find-1.40-498.fc38.noarch.rpm: Already downloaded [SKIPPED] perl-File-stat-1.12-498.fc38.noarch.rpm: Already downloaded [SKIPPED] perl-FileHandle-2.03-498.fc38.noarch.rpm: Already downloaded [SKIPPED] perl-Getopt-Std-1.13-498.fc38.noarch.rpm: Already downloaded [SKIPPED] perl-Git-2.44.0-1.fc38.noarch.rpm: Already downloaded [SKIPPED] perl-HTTP-Tiny-0.086-2.fc38.noarch.rpm: Already downloaded [SKIPPED] perl-IO-1.50-498.fc38.x86_64.rpm: Already downloaded [SKIPPED] perl-IPC-Open3-1.22-498.fc38.noarch.rpm: Already downloaded [SKIPPED] perl-POSIX-2.03-498.fc38.x86_64.rpm: Already downloaded [SKIPPED] perl-SelectSaver-1.02-498.fc38.noarch.rpm: Already downloaded [SKIPPED] perl-Symbol-1.09-498.fc38.noarch.rpm: Already downloaded [SKIPPED] perl-Text-Tabs+Wrap-2023.0511-1.fc38.noarch.rpm: Already downloaded [SKIPPED] perl-base-2.27-498.fc38.noarch.rpm: Already downloaded [SKIPPED] 
perl-if-0.61.000-498.fc38.noarch.rpm: Already downloaded [SKIPPED] perl-interpreter-5.36.3-498.fc38.x86_64.rpm: Already downloaded [SKIPPED] perl-lib-0.65-498.fc38.x86_64.rpm: Already downloaded [SKIPPED] perl-libs-5.36.3-498.fc38.x86_64.rpm: Already downloaded [SKIPPED] perl-locale-1.10-498.fc38.noarch.rpm: Already downloaded [SKIPPED] perl-mro-1.26-498.fc38.x86_64.rpm: Already downloaded [SKIPPED] perl-overload-1.35-498.fc38.noarch.rpm: Already downloaded [SKIPPED] perl-overloading-0.02-498.fc38.noarch.rpm: Already downloaded [SKIPPED] perl-vars-1.05-498.fc38.noarch.rpm: Already downloaded [SKIPPED] poppler-23.02.0-3.fc38.x86_64.rpm: Already downloaded [SKIPPED] poppler-glib-23.02.0-3.fc38.x86_64.rpm: Already downloaded [SKIPPED] pyproject-rpm-macros-1.12.0-1.fc38.noarch.rpm: Already downloaded [SKIPPED] python-pip-wheel-22.3.1-3.fc38.noarch.rpm: Already downloaded [SKIPPED] python3-3.11.8-2.fc38.x86_64.rpm: Already downloaded [SKIPPED] python3-devel-3.11.8-2.fc38.x86_64.rpm: Already downloaded [SKIPPED] python3-libs-3.11.8-2.fc38.x86_64.rpm: Already downloaded [SKIPPED] python3-rpm-generators-14-4.fc38.noarch.rpm: Already downloaded [SKIPPED] rav1e-libs-0.7.1-1.fc38.x86_64.rpm: Already downloaded [SKIPPED] vim-filesystem-9.1.264-1.fc38.noarch.rpm: Already downloaded [SKIPPED] xapian-core-libs-1.4.23-1.fc38.x86_64.rpm: Already downloaded (215/590): cpuinfo-0-20240327.0.gitf42f5eaf.fc3 1.0 MB/s | 46 kB 00:00 (216/590): asmjit-devel-0-20220702.1.gitc598476 4.4 MB/s | 230 kB 00:00 (217/590): cpuinfo-devel-0-20240327.0.gitf42f5e 1.5 MB/s | 24 kB 00:00 (218/590): asmjit-0-20220702.1.gitc5984762.fc38 2.5 MB/s | 205 kB 00:00 (219/590): cuda-gcc-12-c++-12.3.1-1.fc38.x86_64 39 MB/s | 15 MB 00:00 (220/590): cutlass-devel-3.4.1-20240215.0.cu12_ 18 MB/s | 774 kB 00:00 (221/590): fbgemm-0.7.0-20240315.0.git0049a2ca. 
19 MB/s | 1.4 MB 00:00 (222/590): fbgemm-devel-0.7.0-20240315.0.git004 2.4 MB/s | 63 kB 00:00 (223/590): foxi-0-20210526.1.gitc278588e.fc37.x 461 kB/s | 12 kB 00:00 (224/590): foxi-devel-0-20210526.1.gitc278588e. 1.1 MB/s | 24 kB 00:00 (225/590): fp16-0-20240410.0.git581ac1c7.fc38.x 256 kB/s | 12 kB 00:00 (226/590): fp16-devel-0-20240410.0.git581ac1c7. 555 kB/s | 13 kB 00:00 (227/590): fxdiv-devel-0-20201208.1.git63058eff 443 kB/s | 12 kB 00:00 (228/590): cuda-gcc-12-12.3.1-1.fc38.x86_64.rpm 46 MB/s | 34 MB 00:00 (229/590): gklib-5.1.1-20230326.0.git8bd6bad7.f 1.9 MB/s | 103 kB 00:00 (230/590): gemmlowp-devel-0-20231104.0.git16e86 1.1 MB/s | 157 kB 00:00 (231/590): gloo-0.5.0-20240302.0.git2565674c.cu 18 MB/s | 825 kB 00:00 (232/590): gloo-devel-0.5.0-20240302.0.git25656 2.7 MB/s | 74 kB 00:00 (233/590): kineto-0.4.0-20240327.0.git445909a8. 8.5 MB/s | 296 kB 00:00 (234/590): kineto-devel-0.4.0-20240327.0.git445 1.3 MB/s | 23 kB 00:00 (235/590): halide-17.0.1-20240220.0.fc38.x86_64 36 MB/s | 20 MB 00:00 (236/590): magma-devel-2.8.0-20240328.0.cu12_3. 12 MB/s | 976 kB 00:00 (237/590): metis-5.2.1-20230403.0.gite0f1b88b.f 3.3 MB/s | 176 kB 00:00 (238/590): neon2sse-devel-0-20230131.0.git097a5 2.3 MB/s | 85 kB 00:00 (239/590): nnpack-0-20230201.0.git70a77f48.fc38 1.4 MB/s | 56 kB 00:00 (240/590): nnpack-devel-0-20230201.0.git70a77f4 631 kB/s | 16 kB 00:00 (241/590): onnx-devel-1.17.0-20240404.0.git4128 7.5 MB/s | 129 kB 00:00 (242/590): onnx-libs-1.17.0-20240404.0.git4128a 41 MB/s | 867 kB 00:00 (243/590): onnx-optimizer-0.3.19-20240303.0.git 5.2 MB/s | 199 kB 00:00 (244/590): onnx-optimizer-devel-0.3.19-20240303 3.7 MB/s | 50 kB 00:00 (245/590): opencv-4.9.0-20231227.1.cu12_3.fc38. 49 MB/s | 4.4 MB 00:00 (246/590): opencv-contrib-4.9.0-20231227.1.cu12 54 MB/s | 5.7 MB 00:00 (247/590): opencv-core-4.9.0-20231227.1.cu12_3. 39 MB/s | 10 MB 00:00 (248/590): magma-2.8.0-20240328.0.cu12_3.fc38.x 63 MB/s | 119 MB 00:01 (249/590): opencv-cuda-4.9.0-20231227.1.cu12_3. 
54 MB/s | 37 MB 00:00 (250/590): cutlass-3.4.1-20240215.0.cu12_3.fc38 60 MB/s | 181 MB 00:03 (251/590): opencv-devel-4.9.0-20231227.1.cu12_3 5.2 MB/s | 1.3 MB 00:00 (252/590): opencv-static-4.9.0-20231227.1.cu12_ 2.3 MB/s | 425 kB 00:00 (253/590): psimd-devel-0-20200517.2.git072586a7 749 kB/s | 13 kB 00:00 (254/590): peachpy-python3-0-20221113.1.git349e 26 MB/s | 700 kB 00:00 (255/590): pthreadpool-devel-0.1-20240121.0.git 1.1 MB/s | 15 kB 00:00 (256/590): qnnpack-0-20190828.2.git7d2a4e99.fc3 4.6 MB/s | 50 kB 00:00 (257/590): qnnpack-devel-0-20190828.2.git7d2a4e 1.0 MB/s | 12 kB 00:00 (258/590): sleef-devel-3.6-20240320.0.git60e76d 2.2 MB/s | 28 kB 00:00 (259/590): sleef-3.6-20240320.0.git60e76d2b.fc3 25 MB/s | 903 kB 00:00 (260/590): tensorpipe-0-20220513.1.gitbb1473a4. 45 MB/s | 802 kB 00:00 (261/590): pthreadpool-0.1-20240121.0.git178e3e 571 kB/s | 44 kB 00:00 (262/590): tensorpipe-devel-0-20220513.1.gitbb1 10 MB/s | 109 kB 00:00 (263/590): libcublas-devel-12-3-12.3.4.1-2.x86_ 3.1 MB/s | 88 kB 00:00 (264/590): libcudnn8-devel-8.9.7.29-2.cuda12.3. 2.3 MB/s | 34 kB 00:00 (265/590): libcufft-12-3-11.0.12.1-2.x86_64.rpm 75 MB/s | 60 MB 00:00 (266/590): libcufft-devel-12-3-11.0.12.1-2.x86_ 589 kB/s | 34 kB 00:00 (267/590): libcusolver-12-3-11.5.4.101-2.x86_64 57 MB/s | 77 MB 00:01 (268/590): libcusolver-devel-12-3-11.5.4.101-2. 1.1 MB/s | 60 kB 00:00 (269/590): libcublas-12-3-12.3.4.1-2.x86_64.rpm 74 MB/s | 245 MB 00:03 (270/590): libcusparse-12-3-12.2.0.103-2.x86_64 73 MB/s | 108 MB 00:01 (271/590): libcusparse-devel-12-3-12.2.0.103-2. 
61 MB/s | 108 MB 00:01 (272/590): cuda-cccl-12-3-12.3.101-1.x86_64.rpm 97 MB/s | 1.9 MB 00:00 (273/590): cuda-crt-12-3-12.3.107-1.x86_64.rpm 21 MB/s | 111 kB 00:00 (274/590): cuda-cudart-12-3-12.3.101-1.x86_64.r 50 MB/s | 223 kB 00:00 (275/590): cuda-cudart-devel-12-3-12.3.101-1.x8 140 MB/s | 2.0 MB 00:00 (276/590): libnpp-12-3-12.2.3.2-2.x86_64.rpm 62 MB/s | 96 MB 00:01 (277/590): cuda-driver-devel-12-3-12.3.101-1.x8 5.0 MB/s | 42 kB 00:00 (278/590): cuda-cupti-12-3-12.3.101-1.x86_64.rp 128 MB/s | 29 MB 00:00 (279/590): cuda-nvml-devel-12-3-12.3.101-1.x86_ 22 MB/s | 121 kB 00:00 (280/590): cuda-nvrtc-12-3-12.3.107-1.x86_64.rp 315 MB/s | 24 MB 00:00 (281/590): cuda-nvrtc-devel-12-3-12.3.107-1.x86 253 MB/s | 22 MB 00:00 (282/590): cuda-nvtx-12-3-12.3.101-1.x86_64.rpm 21 MB/s | 88 kB 00:00 (283/590): libcudnn8-8.9.7.29-2.cuda12.3.x86_64 75 MB/s | 447 MB 00:05 (284/590): cuda-nvcc-12-3-12.3.107-1.x86_64.rpm 92 MB/s | 64 MB 00:00 (285/590): cuda-profiler-api-12-3-12.3.101-1.x8 406 kB/s | 26 kB 00:00 (286/590): cuda-toolkit-12-config-common-12.4.1 2.9 MB/s | 7.9 kB 00:00 (287/590): cuda-toolkit-12-3-config-common-12.3 1.4 MB/s | 7.7 kB 00:00 (288/590): cuda-toolkit-config-common-12.4.127- 1.7 MB/s | 7.9 kB 00:00 (289/590): cuda-nvvm-12-3-12.3.107-1.x86_64.rpm 47 MB/s | 26 MB 00:00 (290/590): libcurand-12-3-10.3.4.107-1.x86_64.r 223 MB/s | 53 MB 00:00 (291/590): libcurand-devel-12-3-10.3.4.107-1.x8 187 MB/s | 53 MB 00:00 (292/590): libnccl-devel-2.21.5-1+cuda12.4.x86_ 315 kB/s | 16 kB 00:00 (293/590): libnvjitlink-12-3-12.3.101-1.x86_64. 
268 MB/s | 19 MB 00:00 (294/590): libnvjitlink-devel-12-3-12.3.101-1.x 185 MB/s | 18 MB 00:00 (295/590): MUMPS-common-5.5.1-1.fc38.noarch.rpm 19 MB/s | 830 kB 00:00 (296/590): SuperLU-5.3.0-4.fc38.x86_64.rpm 31 MB/s | 183 kB 00:00 (297/590): MUMPS-5.5.1-1.fc38.x86_64.rpm 24 MB/s | 2.0 MB 00:00 (298/590): arpack-3.8.0-6.fc38.x86_64.rpm 21 MB/s | 206 kB 00:00 (299/590): byte-buddy-1.12.10-3.fc38.noarch.rpm 85 MB/s | 2.9 MB 00:00 (300/590): byte-buddy-agent-1.12.10-3.fc38.noar 1.8 MB/s | 65 kB 00:00 (301/590): cdparanoia-libs-10.2-41.fc38.x86_64. 9.9 MB/s | 54 kB 00:00 (302/590): cfitsio-4.2.0-3.fc38.x86_64.rpm 57 MB/s | 607 kB 00:00 (303/590): libnccl-2.21.5-1+cuda12.4.x86_64.rpm 229 MB/s | 130 MB 00:00 (304/590): cgnslib-libs-4.3.0-7.fc38.x86_64.rpm 2.4 MB/s | 296 kB 00:00 (305/590): ceres-solver-2.1.0-5.fc38.x86_64.rpm 5.1 MB/s | 720 kB 00:00 (306/590): cliquer-libs-1.22-5.fc38.x86_64.rpm 1.2 MB/s | 38 kB 00:00 (307/590): cjson-1.7.14-7.fc38.x86_64.rpm 518 kB/s | 31 kB 00:00 (308/590): coin-or-Cbc-2.10.5-12.fc38.x86_64.rp 22 MB/s | 833 kB 00:00 (309/590): codec2-1.0.5-2.fc38.x86_64.rpm 8.8 MB/s | 641 kB 00:00 (310/590): coin-or-Cgl-0.60.3-9.fc38.x86_64.rpm 22 MB/s | 432 kB 00:00 (311/590): coin-or-Clp-1.17.6-12.fc38.x86_64.rp 37 MB/s | 938 kB 00:00 (312/590): copy-jdk-configs-4.1-2.fc38.noarch.r 9.5 MB/s | 28 kB 00:00 (313/590): dbus-broker-33-1.fc38.x86_64.rpm 29 MB/s | 173 kB 00:00 (314/590): coin-or-Osi-0.108.6-8.fc38.x86_64.rp 12 MB/s | 320 kB 00:00 (315/590): double-conversion-3.1.5-8.fc38.x86_6 7.2 MB/s | 49 kB 00:00 (316/590): coin-or-CoinUtils-2.11.4-9.fc38.x86_ 12 MB/s | 482 kB 00:00 (317/590): fdk-aac-free-2.0.0-10.fc38.x86_64.rp 36 MB/s | 336 kB 00:00 (318/590): eigen3-devel-3.4.0-9.fc38.noarch.rpm 20 MB/s | 1.2 MB 00:00 (319/590): flatbuffers-23.3.3-1.fc38.x86_64.rpm 3.3 MB/s | 197 kB 00:00 (320/590): freexl-1.0.6-21.fc38.x86_64.rpm 8.1 MB/s | 35 kB 00:00 (321/590): game-music-emu-0.6.3-11.fc38.x86_64. 
24 MB/s | 157 kB 00:00 (322/590): gdk-pixbuf2-modules-2.42.10-2.fc38.x 14 MB/s | 85 kB 00:00 (323/590): flatbuffers-devel-23.3.3-1.fc38.x86_ 2.0 MB/s | 107 kB 00:00 (324/590): flatbuffers-compiler-23.3.3-1.fc38.x 8.0 MB/s | 1.0 MB 00:00 (325/590): geos-3.11.1-3.fc38.x86_64.rpm 22 MB/s | 994 kB 00:00 (326/590): gecode-6.2.0-11.fc38.x86_64.rpm 44 MB/s | 3.2 MB 00:00 (327/590): gflags-2.2.2-11.fc38.x86_64.rpm 3.4 MB/s | 93 kB 00:00 (328/590): gflags-devel-2.2.2-11.fc38.x86_64.rp 706 kB/s | 24 kB 00:00 (329/590): glog-0.3.5-17.fc38.x86_64.rpm 1.4 MB/s | 68 kB 00:00 (330/590): gl-manpages-1.1-26.20190306.fc38.noa 20 MB/s | 1.2 MB 00:00 (331/590): glx-utils-8.5.0-1.fc38.x86_64.rpm 4.7 MB/s | 40 kB 00:00 (332/590): glog-devel-0.3.5-17.fc38.x86_64.rpm 978 kB/s | 38 kB 00:00 (333/590): gmp-c++-6.2.1-4.fc38.x86_64.rpm 2.6 MB/s | 18 kB 00:00 (334/590): gmp-devel-6.2.1-4.fc38.x86_64.rpm 26 MB/s | 173 kB 00:00 (335/590): graphene-1.10.6-5.fc38.x86_64.rpm 5.4 MB/s | 62 kB 00:00 (336/590): glpk-5.0-6.fc38.x86_64.rpm 12 MB/s | 387 kB 00:00 (337/590): gsm-1.0.22-2.fc38.x86_64.rpm 2.3 MB/s | 35 kB 00:00 (338/590): hdf-libs-4.2.15-12.fc38.x86_64.rpm 11 MB/s | 294 kB 00:00 (339/590): hdf5-1.12.1-11.fc38.x86_64.rpm 58 MB/s | 2.2 MB 00:00 (340/590): gsl-2.7.1-4.fc38.x86_64.rpm 16 MB/s | 1.1 MB 00:00 (341/590): hiredis-1.0.2-4.fc38.x86_64.rpm 1.1 MB/s | 42 kB 00:00 (342/590): ilbc-3.0.4-4.fc38.x86_64.rpm 4.1 MB/s | 53 kB 00:00 (343/590): infiniband-diags-44.0-3.fc38.x86_64. 
23 MB/s | 329 kB 00:00 (344/590): hiredis-devel-1.0.2-4.fc38.x86_64.rp 1.2 MB/s | 37 kB 00:00 (345/590): isl-0.16.1-17.fc38.x86_64.rpm 62 MB/s | 853 kB 00:00 (346/590): javapackages-filesystem-6.1.0-7.fc38 1.9 MB/s | 13 kB 00:00 (347/590): javapackages-tools-6.1.0-7.fc38.noar 1.6 MB/s | 37 kB 00:00 (348/590): kmod-libs-30-4.fc38.x86_64.rpm 12 MB/s | 68 kB 00:00 (349/590): jacop-4.9.0-1.fc38.noarch.rpm 37 MB/s | 1.7 MB 00:00 (350/590): lame-libs-3.100-14.fc38.x86_64.rpm 40 MB/s | 337 kB 00:00 (351/590): iso-codes-4.13.0-1.fc38.noarch.rpm 48 MB/s | 3.5 MB 00:00 (352/590): leveldb-devel-1.23-6.fc38.x86_64.rpm 1.1 MB/s | 53 kB 00:00 (353/590): libXau-devel-1.0.11-2.fc38.x86_64.rp 5.3 MB/s | 14 kB 00:00 (354/590): libGLEW-2.2.0-4.fc38.x86_64.rpm 5.3 MB/s | 175 kB 00:00 (355/590): libXcursor-1.2.1-3.fc38.x86_64.rpm 8.1 MB/s | 30 kB 00:00 (356/590): libXfixes-6.0.0-5.fc38.x86_64.rpm 3.8 MB/s | 19 kB 00:00 (357/590): leveldb-1.23-6.fc38.x86_64.rpm 2.5 MB/s | 151 kB 00:00 (358/590): libXv-1.0.11-18.fc38.x86_64.rpm 2.6 MB/s | 18 kB 00:00 (359/590): libXxf86vm-1.1.5-2.fc38.x86_64.rpm 4.6 MB/s | 18 kB 00:00 (360/590): libaec-1.0.6-4.fc38.x86_64.rpm 3.9 MB/s | 42 kB 00:00 (361/590): libchromaprint-1.5.1-8.fc38.x86_64.r 4.0 MB/s | 39 kB 00:00 (362/590): libcom_err-devel-1.46.5-4.fc38.x86_6 3.2 MB/s | 15 kB 00:00 (363/590): libgeotiff-1.7.1-6.fc38.x86_64.rpm 18 MB/s | 106 kB 00:00 (364/590): libdc1394-2.2.6-9.fc38.x86_64.rpm 14 MB/s | 130 kB 00:00 (365/590): libglvnd-1.6.0-2.fc38.x86_64.rpm 21 MB/s | 134 kB 00:00 (366/590): libglvnd-core-devel-1.6.0-2.fc38.x86 2.6 MB/s | 18 kB 00:00 (367/590): libglvnd-devel-1.6.0-2.fc38.x86_64.r 28 MB/s | 163 kB 00:00 (368/590): libglvnd-egl-1.6.0-2.fc38.x86_64.rpm 7.0 MB/s | 36 kB 00:00 (369/590): libglvnd-glx-1.6.0-2.fc38.x86_64.rpm 26 MB/s | 142 kB 00:00 (370/590): libglvnd-opengl-1.6.0-2.fc38.x86_64. 
9.3 MB/s | 43 kB 00:00 (371/590): libgta-1.2.1-9.fc38.x86_64.rpm 11 MB/s | 35 kB 00:00 (372/590): libglvnd-gles-1.6.0-2.fc38.x86_64.rp 1.9 MB/s | 32 kB 00:00 (373/590): libgudev-237-4.fc38.x86_64.rpm 5.6 MB/s | 35 kB 00:00 (374/590): libibumad-44.0-3.fc38.x86_64.rpm 8.5 MB/s | 27 kB 00:00 (375/590): libibverbs-44.0-3.fc38.x86_64.rpm 64 MB/s | 429 kB 00:00 (376/590): libharu-2.4.3-2.fc38.x86_64.rpm 23 MB/s | 580 kB 00:00 (377/590): libkml-1.3.0-43.fc38.x86_64.rpm 35 MB/s | 355 kB 00:00 (378/590): libbluray-1.3.4-2.fc38.x86_64.rpm 1.9 MB/s | 173 kB 00:00 (379/590): libldb-2.7.2-1.fc38.x86_64.rpm 8.2 MB/s | 180 kB 00:00 (380/590): libmodplug-0.8.9.0-16.fc38.x86_64.rp 9.4 MB/s | 176 kB 00:00 (381/590): libogg-1.3.5-5.fc38.x86_64.rpm 2.2 MB/s | 33 kB 00:00 (382/590): libnl3-3.7.0-3.fc38.x86_64.rpm 17 MB/s | 345 kB 00:00 (383/590): libpciaccess-0.16-8.fc38.x86_64.rpm 1.5 MB/s | 26 kB 00:00 (384/590): libproxy-0.4.18-6.fc38.x86_64.rpm 5.1 MB/s | 71 kB 00:00 (385/590): libqhull_r-7.2.1-12.fc38.x86_64.rpm 15 MB/s | 167 kB 00:00 (386/590): libraw1394-2.1.2-17.fc38.x86_64.rpm 6.4 MB/s | 64 kB 00:00 (387/590): librdmacm-44.0-3.fc38.x86_64.rpm 5.3 MB/s | 72 kB 00:00 (388/590): libicu-72.1-2.fc38.x86_64.rpm 67 MB/s | 10 MB 00:00 (389/590): librist-0.2.7-1.fc38.x86_64.rpm 1.6 MB/s | 77 kB 00:00 (390/590): libseccomp-2.5.3-4.fc38.x86_64.rpm 15 MB/s | 71 kB 00:00 (391/590): librttopo-1.1.0-11.fc38.x86_64.rpm 24 MB/s | 208 kB 00:00 (392/590): libselinux-devel-3.5-1.fc38.x86_64.r 36 MB/s | 151 kB 00:00 (393/590): libsepol-devel-3.5-1.fc38.x86_64.rpm 17 MB/s | 49 kB 00:00 (394/590): libsodium-1.0.18-11.fc38.x86_64.rpm 39 MB/s | 162 kB 00:00 (395/590): libspatialite-5.0.1-20.fc38.x86_64.r 68 MB/s | 3.1 MB 00:00 (396/590): libtalloc-2.4.0-2.fc38.x86_64.rpm 3.8 MB/s | 31 kB 00:00 (397/590): libtdb-1.4.8-1.fc38.x86_64.rpm 13 MB/s | 51 kB 00:00 (398/590): libtevent-0.14.1-1.fc38.x86_64.rpm 13 MB/s | 45 kB 00:00 (399/590): libtheora-1.1.1-33.fc38.x86_64.rpm 33 MB/s | 166 kB 00:00 (400/590): 
libsodium-devel-1.0.18-11.fc38.x86_6 12 MB/s | 1.1 MB 00:00 (401/590): libunwind-1.6.2-7.fc38.x86_64.rpm 14 MB/s | 68 kB 00:00 (402/590): libudfread-1.1.2-5.fc38.x86_64.rpm 1.0 MB/s | 35 kB 00:00 (403/590): librabbitmq-0.13.0-1.fc38.x86_64.rpm 209 kB/s | 43 kB 00:00 (404/590): libunwind-devel-1.6.2-7.fc38.x86_64. 3.5 MB/s | 88 kB 00:00 (405/590): libvdpau-1.5-3.fc38.x86_64.rpm 2.0 MB/s | 16 kB 00:00 (406/590): libverto-devel-0.3.2-5.fc38.x86_64.r 1.8 MB/s | 14 kB 00:00 (407/590): libvisual-0.4.1-1.fc38.x86_64.rpm 28 MB/s | 151 kB 00:00 (408/590): libvorbis-1.3.7-7.fc38.x86_64.rpm 31 MB/s | 195 kB 00:00 (409/590): libxkbcommon-1.5.0-2.fc38.x86_64.rpm 11 MB/s | 140 kB 00:00 (410/590): libva-2.18.0-1.fc38.x86_64.rpm 2.2 MB/s | 105 kB 00:00 (411/590): libxcb-devel-1.13.1-11.fc38.x86_64.r 66 MB/s | 1.4 MB 00:00 (412/590): libxkbcommon-x11-1.5.0-2.fc38.x86_64 1.0 MB/s | 22 kB 00:00 (413/590): libxshmfence-1.3-12.fc38.x86_64.rpm 573 kB/s | 12 kB 00:00 (414/590): libyaml-0.2.5-9.fc38.x86_64.rpm 2.8 MB/s | 59 kB 00:00 (415/590): lksctp-tools-1.0.19-3.fc38.x86_64.rp 2.9 MB/s | 92 kB 00:00 (416/590): lua-5.4.4-9.fc38.x86_64.rpm 7.1 MB/s | 190 kB 00:00 (417/590): lua-filesystem-1.8.0-8.fc38.x86_64.r 729 kB/s | 34 kB 00:00 (418/590): lua-json-1.3.4-3.fc38.noarch.rpm 643 kB/s | 30 kB 00:00 (419/590): lua-posix-35.1-5.fc38.x86_64.rpm 3.2 MB/s | 138 kB 00:00 (420/590): lua-lpeg-1.0.2-10.fc38.x86_64.rpm 717 kB/s | 67 kB 00:00 (421/590): lua-term-0.07-17.fc38.x86_64.rpm 217 kB/s | 15 kB 00:00 (422/590): miniz-3.0.2-2.fc38.x86_64.rpm 1.2 MB/s | 65 kB 00:00 (423/590): miniz-devel-3.0.2-2.fc38.x86_64.rpm 623 kB/s | 33 kB 00:00 (424/590): mockito-3.12.4-6.fc38.noarch.rpm 13 MB/s | 583 kB 00:00 (425/590): mp-3.1.0-41.20200303git7fd4828.fc38. 
30 MB/s | 978 kB 00:00 (426/590): mpg123-libs-1.31.3-1.fc38.x86_64.rpm 38 MB/s | 340 kB 00:00 (427/590): lpcnetfreedv-0.2-13.fc38.x86_64.rpm 25 MB/s | 7.3 MB 00:00 (428/590): mpfr-devel-4.1.1-3.fc38.x86_64.rpm 588 kB/s | 21 kB 00:00 (429/590): netcdf-4.9.0-5.fc38.x86_64.rpm 57 MB/s | 833 kB 00:00 (430/590): numactl-libs-2.0.16-2.fc38.x86_64.rp 3.2 MB/s | 31 kB 00:00 (431/590): numactl-devel-2.0.16-2.fc38.x86_64.r 610 kB/s | 22 kB 00:00 (432/590): mtdev-1.1.6-5.fc38.x86_64.rpm 373 kB/s | 21 kB 00:00 (433/590): objectweb-asm-9.3-5.fc38.noarch.rpm 16 MB/s | 355 kB 00:00 (434/590): ogdi-4.1.0-10.fc38.x86_64.rpm 19 MB/s | 244 kB 00:00 (435/590): openblas-0.3.21-4.fc38.x86_64.rpm 2.1 MB/s | 35 kB 00:00 (436/590): openblas-openmp-0.3.21-4.fc38.x86_64 180 MB/s | 5.1 MB 00:00 (437/590): objenesis-3.3-2.fc38.noarch.rpm 2.3 MB/s | 116 kB 00:00 (438/590): openblas-devel-0.3.21-4.fc38.x86_64. 1.5 MB/s | 81 kB 00:00 (439/590): openblas-openmp64-0.3.21-4.fc38.x86_ 112 MB/s | 4.9 MB 00:00 (440/590): openblas-serial-0.3.21-4.fc38.x86_64 98 MB/s | 4.9 MB 00:00 (441/590): openblas-openmp64_-0.3.21-4.fc38.x86 27 MB/s | 4.9 MB 00:00 (442/590): openblas-serial64-0.3.21-4.fc38.x86_ 31 MB/s | 4.8 MB 00:00 (443/590): openblas-serial64_-0.3.21-4.fc38.x86 28 MB/s | 4.8 MB 00:00 (444/590): openblas-threads-0.3.21-4.fc38.x86_6 55 MB/s | 5.1 MB 00:00 (445/590): opencore-amr-0.1.6-3.fc38.x86_64.rpm 4.0 MB/s | 177 kB 00:00 (446/590): openpgm-5.2.122-31.fc38.x86_64.rpm 40 MB/s | 176 kB 00:00 (447/590): openblas-threads64_-0.3.21-4.fc38.x8 54 MB/s | 4.9 MB 00:00 (448/590): openslide-3.4.1-23.fc38.x86_64.rpm 5.3 MB/s | 106 kB 00:00 (449/590): openblas-threads64-0.3.21-4.fc38.x86 32 MB/s | 4.9 MB 00:00 (450/590): opus-1.3.1-12.fc38.x86_64.rpm 44 MB/s | 206 kB 00:00 (451/590): opentest4j-1.2.0-12.fc38.noarch.rpm 1.4 MB/s | 24 kB 00:00 (452/590): orc-0.4.33-2.fc38.x86_64.rpm 39 MB/s | 202 kB 00:00 (453/590): pcre-8.45-1.fc38.3.x86_64.rpm 28 MB/s | 201 kB 00:00 (454/590): 
pcre2-devel-10.42-1.fc38.1.x86_64.rp 55 MB/s | 506 kB 00:00 (455/590): pcre2-utf32-10.42-1.fc38.1.x86_64.rp 63 MB/s | 201 kB 00:00 (456/590): pcre2-utf16-10.42-1.fc38.1.x86_64.rp 26 MB/s | 214 kB 00:00 (457/590): proj-9.1.1-1.fc38.x86_64.rpm 165 MB/s | 1.4 MB 00:00 (458/590): openpgm-devel-5.2.122-31.fc38.x86_64 957 kB/s | 67 kB 00:00 (459/590): proj-data-9.1.1-1.fc38.noarch.rpm 86 MB/s | 1.2 MB 00:00 (460/590): protobuf-3.19.6-2.fc38.x86_64.rpm 86 MB/s | 1.0 MB 00:00 (461/590): python3-pyyaml-6.0-6.fc38.x86_64.rpm 7.3 MB/s | 225 kB 00:00 (462/590): python3-typing-extensions-4.5.0-1.fc 15 MB/s | 63 kB 00:00 (463/590): rdma-core-devel-44.0-3.fc38.x86_64.r 60 MB/s | 418 kB 00:00 (464/590): python3-pybind11-2.10.3-2.fc38.x86_6 3.9 MB/s | 194 kB 00:00 (465/590): pugixml-1.13-2.fc38.x86_64.rpm 1.4 MB/s | 100 kB 00:00 (466/590): scotch-6.1.2-3.fc37.x86_64.rpm 16 MB/s | 397 kB 00:00 (467/590): snappy-1.1.9-7.fc38.x86_64.rpm 5.0 MB/s | 36 kB 00:00 (468/590): rocksdb-7.8.3-1.fc38.x86_64.rpm 31 MB/s | 2.8 MB 00:00 (469/590): snappy-devel-1.1.9-7.fc38.x86_64.rpm 567 kB/s | 21 kB 00:00 (470/590): speex-1.2.0-13.fc38.x86_64.rpm 11 MB/s | 67 kB 00:00 (471/590): rocksdb-devel-7.8.3-1.fc38.x86_64.rp 2.9 MB/s | 285 kB 00:00 (472/590): soxr-0.1.3-13.fc38.x86_64.rpm 9.1 MB/s | 84 kB 00:00 (473/590): suitesparse-5.13.0-2.fc38.x86_64.rpm 104 MB/s | 1.1 MB 00:00 (474/590): tbb-devel-2020.3-16.fc38.x86_64.rpm 13 MB/s | 335 kB 00:00 (475/590): tcl-8.6.12-4.fc38.x86_64.rpm 46 MB/s | 1.1 MB 00:00 (476/590): twolame-libs-0.4.0-2.fc38.x86_64.rpm 7.8 MB/s | 69 kB 00:00 (477/590): unixODBC-2.3.11-2.fc38.x86_64.rpm 81 MB/s | 483 kB 00:00 (478/590): uriparser-0.9.7-2.fc38.x86_64.rpm 8.5 MB/s | 60 kB 00:00 (479/590): vapoursynth-libs-58-4.fc38.x86_64.rp 35 MB/s | 544 kB 00:00 (480/590): vo-amrwbenc-0.1.3-18.fc38.x86_64.rpm 3.5 MB/s | 80 kB 00:00 (481/590): xcb-util-0.4.1-2.fc38.x86_64.rpm 1.1 MB/s | 19 kB 00:00 (482/590): tbb-2020.3-16.fc38.x86_64.rpm 2.0 MB/s | 169 kB 00:00 (483/590): 
xcb-util-image-0.4.1-2.fc38.x86_64.r 5.0 MB/s | 19 kB 00:00 (484/590): xcb-util-keysyms-0.4.1-2.fc38.x86_64 2.7 MB/s | 14 kB 00:00 (485/590): xcb-util-renderutil-0.3.10-2.fc38.x8 3.8 MB/s | 17 kB 00:00 (486/590): xcb-util-wm-0.4.2-2.fc38.x86_64.rpm 6.6 MB/s | 31 kB 00:00 (487/590): xkeyboard-config-2.38-1.fc38.noarch. 114 MB/s | 963 kB 00:00 (488/590): xorg-x11-proto-devel-2022.2-3.fc38.n 41 MB/s | 299 kB 00:00 (489/590): zeromq-4.3.4-5.fc38.x86_64.rpm 64 MB/s | 459 kB 00:00 (490/590): xvidcore-1.3.7-9.fc38.x86_64.rpm 19 MB/s | 268 kB 00:00 (491/590): zvbi-0.2.35-19.fc38.x86_64.rpm 12 MB/s | 419 kB 00:00 (492/590): zeromq-devel-4.3.4-5.fc38.x86_64.rpm 425 kB/s | 16 kB 00:00 (493/590): Lmod-8.7.32-1.fc38.x86_64.rpm 22 MB/s | 261 kB 00:00 (494/590): alsa-lib-1.2.11-2.fc38.x86_64.rpm 41 MB/s | 520 kB 00:00 (495/590): blosc-1.21.5-2.fc38.x86_64.rpm 11 MB/s | 59 kB 00:00 (496/590): armadillo-12.8.1-1.fc38.x86_64.rpm 4.3 MB/s | 32 kB 00:00 (497/590): dbus-1.14.10-1.fc38.x86_64.rpm 2.5 MB/s | 8.0 kB 00:00 (498/590): dbus-common-1.14.10-1.fc38.noarch.rp 4.3 MB/s | 15 kB 00:00 (499/590): fftw-3.3.10-10.fc38.x86_64.rpm 13 MB/s | 46 kB 00:00 (500/590): fftw-devel-3.3.10-10.fc38.x86_64.rpm 22 MB/s | 135 kB 00:00 (501/590): fftw-libs-3.3.10-10.fc38.x86_64.rpm 1.0 MB/s | 8.0 kB 00:00 (502/590): fftw-libs-long-3.3.10-10.fc38.x86_64 38 MB/s | 505 kB 00:00 (503/590): fftw-libs-double-3.3.10-10.fc38.x86_ 61 MB/s | 1.2 MB 00:00 (504/590): fftw-libs-quad-3.3.10-10.fc38.x86_64 88 MB/s | 740 kB 00:00 (505/590): fftw-libs-single-3.3.10-10.fc38.x86_ 70 MB/s | 1.2 MB 00:00 (506/590): flexiblas-3.4.2-1.fc38.x86_64.rpm 2.2 MB/s | 25 kB 00:00 (507/590): vtk-9.2.5-2.fc38.x86_64.rpm 125 MB/s | 24 MB 00:00 (508/590): flexiblas-netlib64-3.4.2-1.fc38.x86_ 69 MB/s | 3.0 MB 00:00 (509/590): flexiblas-netlib-3.4.2-1.fc38.x86_64 66 MB/s | 3.1 MB 00:00 (510/590): flexiblas-openblas-openmp64-3.4.2-1. 
3.2 MB/s | 17 kB 00:00 (511/590): flexiblas-openblas-openmp-3.4.2-1.fc 1.4 MB/s | 17 kB 00:00 (512/590): giflib-5.2.2-1.fc38.x86_64.rpm 5.3 MB/s | 52 kB 00:00 (513/590): gstreamer1-1.22.9-1.fc38.x86_64.rpm 47 MB/s | 1.4 MB 00:00 (514/590): gstreamer1-plugins-base-1.22.9-1.fc3 54 MB/s | 2.2 MB 00:00 (515/590): hwdata-0.380-1.fc38.noarch.rpm 58 MB/s | 1.6 MB 00:00 (516/590): imath-3.1.10-1.fc38.x86_64.rpm 10 MB/s | 97 kB 00:00 (517/590): json-c-0.17-1.fc38.x86_64.rpm 6.0 MB/s | 43 kB 00:00 (518/590): keyutils-libs-devel-1.6.3-1.fc38.x86 6.1 MB/s | 60 kB 00:00 (519/590): krb5-devel-1.21-3.fc38.x86_64.rpm 13 MB/s | 144 kB 00:00 (520/590): gdal-libs-3.6.4-2.fc38.x86_64.rpm 74 MB/s | 8.2 MB 00:00 (521/590): libX11-devel-1.8.7-1.fc38.x86_64.rpm 43 MB/s | 1.0 MB 00:00 (522/590): libX11-xcb-1.8.7-1.fc38.x86_64.rpm 1.8 MB/s | 12 kB 00:00 (523/590): libXi-1.8.1-1.fc38.x86_64.rpm 6.6 MB/s | 40 kB 00:00 (524/590): libavformat-free-6.0.1-2.fc38.x86_64 59 MB/s | 1.1 MB 00:00 (525/590): libavutil-free-6.0.1-2.fc38.x86_64.r 25 MB/s | 343 kB 00:00 (526/590): libdrm-2.4.120-1.fc38.x86_64.rpm 7.7 MB/s | 157 kB 00:00 (527/590): libavcodec-free-6.0.1-2.fc38.x86_64. 
65 MB/s | 4.0 MB 00:00 (528/590): libevdev-1.13.1-1.fc38.x86_64.rpm 4.4 MB/s | 44 kB 00:00 (529/590): libgcrypt-1.10.2-1.fc38.x86_64.rpm 40 MB/s | 514 kB 00:00 (530/590): libgpg-error-1.47-1.fc38.x86_64.rpm 15 MB/s | 230 kB 00:00 (531/590): libgfortran-13.2.1-7.fc38.x86_64.rpm 27 MB/s | 910 kB 00:00 (532/590): libinput-1.23.0-2.fc38.x86_64.rpm 11 MB/s | 213 kB 00:00 (533/590): libkadm5-1.21-3.fc38.x86_64.rpm 5.4 MB/s | 78 kB 00:00 (534/590): libopenmpt-0.6.12-1.fc38.x86_64.rpm 54 MB/s | 637 kB 00:00 (535/590): libpq-15.3-1.fc38.x86_64.rpm 12 MB/s | 215 kB 00:00 (536/590): libquadmath-13.2.1-7.fc38.x86_64.rpm 14 MB/s | 200 kB 00:00 (537/590): libsmbclient-4.18.11-1.fc38.x86_64.r 8.5 MB/s | 76 kB 00:00 (538/590): libnauty-2.8.6-5.fc38.x86_64.rpm 10 MB/s | 603 kB 00:00 (539/590): libswresample-free-6.0.1-2.fc38.x86_ 7.0 MB/s | 69 kB 00:00 (540/590): libswscale-free-6.0.1-2.fc38.x86_64. 18 MB/s | 191 kB 00:00 (541/590): liburing-2.4-2.fc38.x86_64.rpm 4.2 MB/s | 38 kB 00:00 (542/590): libusb1-1.0.27-1.fc38.x86_64.rpm 9.2 MB/s | 76 kB 00:00 (543/590): libuv-devel-1.48.0-1.fc38.x86_64.rpm 6.4 MB/s | 42 kB 00:00 (544/590): libuv-static-1.48.0-1.fc38.x86_64.rp 14 MB/s | 106 kB 00:00 (545/590): libvpl-2.10.2-1.fc38.x86_64.rpm 22 MB/s | 174 kB 00:00 (546/590): libvpx-1.13.1-1.fc38.x86_64.rpm 74 MB/s | 1.1 MB 00:00 (547/590): libwacom-2.8.0-1.fc38.x86_64.rpm 4.2 MB/s | 42 kB 00:00 (548/590): libwacom-data-2.8.0-1.fc38.noarch.rp 33 MB/s | 191 kB 00:00 (549/590): libwayland-client-1.22.0-1.fc38.x86_ 4.9 MB/s | 34 kB 00:00 (550/590): libwayland-cursor-1.22.0-1.fc38.x86_ 3.3 MB/s | 19 kB 00:00 (551/590): libwayland-egl-1.22.0-1.fc38.x86_64. 
1.5 MB/s | 13 kB 00:00 (552/590): libwayland-server-1.22.0-1.fc38.x86_ 3.0 MB/s | 42 kB 00:00 (553/590): libwbclient-4.18.11-1.fc38.x86_64.rp 4.1 MB/s | 46 kB 00:00 (554/590): libzstd-devel-1.5.5-1.fc38.x86_64.rp 1.2 MB/s | 51 kB 00:00 (555/590): lmdb-0.9.32-1.fc38.x86_64.rpm 2.9 MB/s | 32 kB 00:00 (556/590): lmdb-devel-0.9.32-1.fc38.x86_64.rpm 4.0 MB/s | 26 kB 00:00 (557/590): lmdb-libs-0.9.32-1.fc38.x86_64.rpm 3.6 MB/s | 61 kB 00:00 (558/590): mariadb-connector-c-3.3.8-1.fc38.x86 18 MB/s | 214 kB 00:00 (559/590): mariadb-connector-c-config-3.3.8-1.f 395 kB/s | 8.6 kB 00:00 (560/590): mbedtls-2.28.7-1.fc38.x86_64.rpm 24 MB/s | 402 kB 00:00 (561/590): mesa-filesystem-23.1.9-1.fc38.x86_64 1.3 MB/s | 17 kB 00:00 (562/590): mesa-libEGL-23.1.9-1.fc38.x86_64.rpm 15 MB/s | 131 kB 00:00 (563/590): java-17-openjdk-headless-17.0.9.0.9- 91 MB/s | 44 MB 00:00 (564/590): mesa-libGL-23.1.9-1.fc38.x86_64.rpm 3.4 MB/s | 173 kB 00:00 (565/590): mesa-libGLU-9.0.3-1.fc38.x86_64.rpm 11 MB/s | 160 kB 00:00 (566/590): mesa-libgbm-23.1.9-1.fc38.x86_64.rpm 6.0 MB/s | 44 kB 00:00 (567/590): mesa-libGLU-devel-9.0.3-1.fc38.x86_6 556 kB/s | 12 kB 00:00 (568/590): minizip-ng-3.0.7-4.fc38.x86_64.rpm 13 MB/s | 69 kB 00:00 (569/590): mesa-libglapi-23.1.9-1.fc38.x86_64.r 8.0 MB/s | 53 kB 00:00 (570/590): ocl-icd-2.3.2-1.fc38.x86_64.rpm 9.5 MB/s | 66 kB 00:00 (571/590): llvm-libs-16.0.6-3.fc38.x86_64.rpm 104 MB/s | 27 MB 00:00 (572/590): ocl-icd-devel-2.3.2-1.fc38.x86_64.rp 1.9 MB/s | 63 kB 00:00 (573/590): opencl-headers-3.0-18.20231003git9ce 3.3 MB/s | 87 kB 00:00 (574/590): python3-numpy-1.24.4-1.fc38.x86_64.r 193 MB/s | 7.9 MB 00:00 (575/590): openexr-libs-3.1.10-1.fc38.x86_64.rp 26 MB/s | 1.1 MB 00:00 (576/590): procps-ng-3.3.17-11.fc38.x86_64.rpm 7.0 MB/s | 338 kB 00:00 (577/590): qt5-qtbase-5.15.12-5.fc38.x86_64.rpm 233 MB/s | 3.6 MB 00:00 (578/590): qt5-qtbase-common-5.15.12-5.fc38.noa 928 kB/s | 12 kB 00:00 (579/590): qt-settings-38.3-1.fc38.noarch.rpm 278 kB/s | 9.2 kB 00:00 (580/590): 
samba-common-4.18.11-1.fc38.noarch.r 29 MB/s | 150 kB 00:00 (581/590): samba-common-libs-4.18.11-1.fc38.x86 12 MB/s | 108 kB 00:00 (582/590): qt5-qtbase-gui-5.15.12-5.fc38.x86_64 114 MB/s | 6.4 MB 00:00 (583/590): samba-client-libs-4.18.11-1.fc38.x86 89 MB/s | 5.3 MB 00:00 (584/590): systemd-pam-253.17-1.fc38.x86_64.rpm 54 MB/s | 337 kB 00:00 (585/590): systemd-253.17-1.fc38.x86_64.rpm 208 MB/s | 4.5 MB 00:00 (586/590): systemd-rpm-macros-253.17-1.fc38.noa 2.2 MB/s | 25 kB 00:00 (587/590): tzdata-java-2024a-1.fc38.noarch.rpm 47 MB/s | 207 kB 00:00 (588/590): zimg-3.0.5-1.fc38.x86_64.rpm 40 MB/s | 290 kB 00:00 (589/590): xerces-c-3.2.5-1.fc38.x86_64.rpm 83 MB/s | 969 kB 00:00 (590/590): srt-libs-1.5.2-1.fc38.x86_64.rpm 3.3 MB/s | 370 kB 00:00 -------------------------------------------------------------------------------- Total 176 MB/s | 2.3 GB 00:13 Running transaction check Transaction check succeeded. Running transaction test Transaction test succeeded. Running transaction Running scriptlet: copy-jdk-configs-4.1-2.fc38.noarch 1/1 Running scriptlet: java-17-openjdk-headless-1:17.0.9.0.9-3.fc38.x86_64 1/1 Preparing : 1/1 Installing : cmake-filesystem-3.27.7-1.fc38.x86_64 1/590 Installing : libpng-2:1.6.37-14.fc38.x86_64 2/590 Installing : libgfortran-13.2.1-7.fc38.x86_64 3/590 Installing : expat-2.6.0-1.fc38.x86_64 4/590 Installing : libjpeg-turbo-2.1.4-2.fc38.x86_64 5/590 Installing : openblas-0.3.21-4.fc38.x86_64 6/590 Installing : javapackages-filesystem-6.1.0-7.fc38.noarch 7/590 Installing : cuda-toolkit-config-common-12.4.127-1.noarch 8/590 Installing : cuda-toolkit-12-config-common-12.4.127-1.noarch 9/590 Installing : cuda-toolkit-12-3-config-common-12.3.101-1.noarc 10/590 Installing : openjpeg2-2.5.2-1.fc38.x86_64 11/590 Installing : nspr-4.35.0-17.fc38.x86_64 12/590 Installing : libwebp-1.3.2-2.fc38.x86_64 13/590 Installing : libX11-xcb-1.8.7-1.fc38.x86_64 14/590 Installing : libcublas-12-3-12.3.4.1-2.x86_64 15/590 Running scriptlet: 
libcublas-12-3-12.3.4.1-2.x86_64 15/590 Installing : libwayland-client-1.22.0-1.fc38.x86_64 16/590 Installing : snappy-1.1.9-7.fc38.x86_64 17/590 Installing : libtalloc-2.4.0-2.fc38.x86_64 18/590 Installing : libogg-2:1.3.5-5.fc38.x86_64 19/590 Installing : libglvnd-1:1.6.0-2.fc38.x86_64 20/590 Installing : libglvnd-opengl-1:1.6.0-2.fc38.x86_64 21/590 Installing : nss-util-3.99.0-1.fc38.x86_64 22/590 Installing : cuda-cudart-12-3-12.3.101-1.x86_64 23/590 Running scriptlet: cuda-cudart-12-3-12.3.101-1.x86_64 23/590 Installing : libuv-1:1.48.0-1.fc38.x86_64 24/590 Installing : libquadmath-13.2.1-7.fc38.x86_64 25/590 Installing : libmpc-1.3.1-2.fc38.x86_64 26/590 Installing : gflags-2.2.2-11.fc38.x86_64 27/590 Installing : fonts-filesystem-1:2.0.5-11.fc38.noarch 28/590 Installing : urw-base35-fonts-common-20200910-16.fc38.noarch 29/590 Installing : protobuf-compat-3.21.9-2.fc38.x86_64 30/590 Installing : cpuinfo-1:0-20240327.0.gitf42f5eaf.fc38.x86_64 31/590 Installing : libtheora-1:1.1.1-33.fc38.x86_64 32/590 Installing : libvorbis-1:1.3.7-7.fc38.x86_64 33/590 Installing : libtevent-0.14.1-1.fc38.x86_64 34/590 Installing : openblas-openmp-0.3.21-4.fc38.x86_64 35/590 Installing : lmdb-libs-0.9.32-1.fc38.x86_64 36/590 Installing : python-rpm-macros-3.11-10.fc38.noarch 37/590 Installing : lua-5.4.4-9.fc38.x86_64 38/590 Installing : libunwind-1.6.2-7.fc38.x86_64 39/590 Installing : libtool-ltdl-2.4.7-6.fc38.x86_64 40/590 Installing : libtdb-1.4.8-1.fc38.x86_64 41/590 Installing : libedit-3.1-45.20221030cvs.fc38.x86_64 42/590 Installing : libICE-1.0.10-10.fc38.x86_64 43/590 Installing : lcms2-2.15-1.fc38.x86_64 44/590 Installing : geos-3.11.1-3.fc38.x86_64 45/590 Installing : cuda-nvrtc-12-3-12.3.107-1.x86_64 46/590 Running scriptlet: cuda-nvrtc-12-3-12.3.107-1.x86_64 46/590 Installing : libcudnn8-8.9.7.29-2.cuda12.3.x86_64 47/590 Installing : pthreadpool-1:0.1-20240121.0.git178e3e06.fc38.x8 48/590 Installing : libSM-1.2.3-12.fc38.x86_64 49/590 Installing : 
unixODBC-2.3.11-2.fc38.x86_64 50/590 Installing : python3-rpm-macros-3.11-10.fc38.noarch 51/590 Installing : onnx-libs-1.17.0-20240404.0.git4128a090.fc38.x86 52/590 Installing : fftw-libs-quad-3.3.10-10.fc38.x86_64 53/590 Installing : libcufft-12-3-11.0.12.1-2.x86_64 54/590 Running scriptlet: libcufft-12-3-11.0.12.1-2.x86_64 54/590 Installing : libcusparse-12-3-12.2.0.103-2.x86_64 55/590 Running scriptlet: libcusparse-12-3-12.2.0.103-2.x86_64 55/590 Installing : libcurand-12-3-10.3.4.107-1.x86_64 56/590 Running scriptlet: libcurand-12-3-10.3.4.107-1.x86_64 56/590 Installing : openblas-openmp64-0.3.21-4.fc38.x86_64 57/590 Installing : flexiblas-netlib64-3.4.2-1.fc38.x86_64 58/590 Installing : flexiblas-netlib-3.4.2-1.fc38.x86_64 59/590 Installing : flexiblas-openblas-openmp-3.4.2-1.fc38.x86_64 60/590 Installing : flexiblas-3.4.2-1.fc38.x86_64 61/590 Installing : flexiblas-openblas-openmp64-3.4.2-1.fc38.x86_64 62/590 Installing : suitesparse-5.13.0-2.fc38.x86_64 63/590 Installing : hdf-libs-4.2.15-12.fc38.x86_64 64/590 Installing : lpcnetfreedv-0.2-13.fc38.x86_64 65/590 Installing : codec2-1.0.5-2.fc38.x86_64 66/590 Installing : rav1e-libs-0.7.1-1.fc38.x86_64 67/590 Installing : ocl-icd-2.3.2-1.fc38.x86_64 68/590 Installing : mesa-libglapi-23.1.9-1.fc38.x86_64 69/590 Installing : libwayland-server-1.22.0-1.fc38.x86_64 70/590 Installing : libdav1d-1.2.1-1.fc38.x86_64 71/590 Installing : imath-3.1.10-1.fc38.x86_64 72/590 Installing : openexr-libs-3.1.10-1.fc38.x86_64 73/590 Installing : fftw-libs-single-3.3.10-10.fc38.x86_64 74/590 Installing : fftw-libs-long-3.3.10-10.fc38.x86_64 75/590 Installing : fftw-libs-double-3.3.10-10.fc38.x86_64 76/590 Installing : dbus-libs-1:1.14.10-1.fc38.x86_64 77/590 Installing : avahi-libs-0.8-22.fc38.x86_64 78/590 Installing : alsa-lib-1.2.11-2.fc38.x86_64 79/590 Installing : adobe-mappings-cmap-20230622-1.fc38.noarch 80/590 Installing : xorg-x11-proto-devel-2022.2-3.fc38.noarch 81/590 Running scriptlet: xml-common-0.6.3-60.fc38.noarch 
82/590 Installing : xml-common-0.6.3-60.fc38.noarch 82/590 Installing : tbb-2020.3-16.fc38.x86_64 83/590 Installing : svt-av1-libs-1.4.1-2.fc38.x86_64 84/590 Installing : pcre2-utf16-10.42-1.fc38.1.x86_64 85/590 Installing : opus-1.3.1-12.fc38.x86_64 86/590 Installing : openpgm-5.2.122-31.fc38.x86_64 87/590 Installing : nettle-3.8-3.fc38.x86_64 88/590 Installing : gnutls-3.8.4-1.fc38.x86_64 89/590 Installing : glib2-2.76.6-1.fc38.x86_64 90/590 Installing : cups-libs-1:2.4.7-11.fc38.x86_64 91/590 Installing : libgudev-237-4.fc38.x86_64 92/590 Installing : shared-mime-info-2.2-3.fc38.x86_64 93/590 Running scriptlet: shared-mime-info-2.2-3.fc38.x86_64 93/590 Installing : gdk-pixbuf2-2.42.10-2.fc38.x86_64 94/590 Installing : lua-posix-35.1-5.fc38.x86_64 95/590 Installing : libxshmfence-1.3-12.fc38.x86_64 96/590 Installing : libsodium-1.0.18-11.fc38.x86_64 97/590 Installing : zeromq-4.3.4-5.fc38.x86_64 98/590 Installing : libnl3-3.7.0-3.fc38.x86_64 99/590 Installing : libibverbs-44.0-3.fc38.x86_64 100/590 Installing : liblerc-4.0.0-3.fc38.x86_64 101/590 Installing : libicu-72.1-2.fc38.x86_64 102/590 Installing : libibumad-44.0-3.fc38.x86_64 103/590 Installing : libaec-1.0.6-4.fc38.x86_64 104/590 Installing : hdf5-1.12.1-11.fc38.x86_64 105/590 Installing : libXau-1.0.11-2.fc38.x86_64 106/590 Installing : libxcb-1.13.1-11.fc38.x86_64 107/590 Installing : jsoncpp-1.9.5-4.fc38.x86_64 108/590 Installing : hiredis-1.0.2-4.fc38.x86_64 109/590 Installing : freexl-1.0.6-21.fc38.x86_64 110/590 Installing : flatbuffers-23.3.3-1.fc38.x86_64 111/590 Installing : double-conversion-3.1.5-8.fc38.x86_64 112/590 Installing : libnccl-2.21.5-1+cuda12.4.x86_64 113/590 Running scriptlet: libnccl-2.21.5-1+cuda12.4.x86_64 113/590 Installing : asmjit-1:0-20220702.1.gitc5984762.fc38.x86_64 114/590 Installing : fbgemm-0.7.0-20240315.0.git0049a2ca.fc38.x86_64 115/590 Installing : gloo-1:0.5.0-20240302.0.git2565674c.cu12_3.fc38. 
116/590 Installing : xcb-util-0.4.1-2.fc38.x86_64 117/590 Installing : xcb-util-image-0.4.1-2.fc38.x86_64 118/590 Installing : xcb-util-keysyms-0.4.1-2.fc38.x86_64 119/590 Installing : xcb-util-renderutil-0.3.10-2.fc38.x86_64 120/590 Installing : xcb-util-wm-0.4.2-2.fc38.x86_64 121/590 Installing : libXau-devel-1.0.11-2.fc38.x86_64 122/590 Installing : libxcb-devel-1.13.1-11.fc38.x86_64 123/590 Installing : cgnslib-libs-4.3.0-7.fc38.x86_64 124/590 Installing : librdmacm-44.0-3.fc38.x86_64 125/590 Installing : libsodium-devel-1.0.18-11.fc38.x86_64 126/590 Installing : copy-jdk-configs-4.1-2.fc38.noarch 127/590 Installing : graphene-1.10.6-5.fc38.x86_64 128/590 Installing : srt-libs-1.5.2-1.fc38.x86_64 129/590 Installing : openpgm-devel-5.2.122-31.fc38.x86_64 130/590 Installing : iso-codes-4.13.0-1.fc38.noarch 131/590 Installing : adobe-mappings-cmap-deprecated-20230622-1.fc38.n 132/590 Installing : fftw-3.3.10-10.fc38.x86_64 133/590 Installing : fftw-libs-3.3.10-10.fc38.x86_64 134/590 Installing : glpk-5.0-6.fc38.x86_64 135/590 Installing : coin-or-CoinUtils-2.11.4-9.fc38.x86_64 136/590 Installing : coin-or-Osi-0.108.6-8.fc38.x86_64 137/590 Installing : SuperLU-5.3.0-4.fc38.x86_64 138/590 Installing : arpack-3.8.0-6.fc38.x86_64 139/590 Installing : armadillo-12.8.1-1.fc38.x86_64 140/590 Installing : magma-2.8.0-20240328.0.cu12_3.fc38.x86_64 141/590 Installing : pyproject-rpm-macros-1.12.0-1.fc38.noarch 142/590 Installing : nnpack-0-20230201.0.git70a77f48.fc38.x86_64 143/590 Installing : qnnpack-0-20190828.2.git7d2a4e99.fc38.x86_64 144/590 Installing : librttopo-1.1.0-11.fc38.x86_64 145/590 Installing : llvm15-libs-15.0.7-4.fc38.x86_64 146/590 Installing : llvm-libs-16.0.6-3.fc38.x86_64 147/590 Installing : halide-17.0.1-20240220.0.fc38.x86_64 148/590 Installing : libldb-2.7.2-1.fc38.x86_64 149/590 Installing : libunwind-devel-1.6.2-7.fc38.x86_64 150/590 Installing : lua-term-0.07-17.fc38.x86_64 151/590 Installing : lmdb-0.9.32-1.fc38.x86_64 152/590 Installing : 
protobuf-compat-compiler-3.21.9-2.fc38.x86_64 153/590 Installing : urw-base35-bookman-fonts-20200910-16.fc38.noarch 154/590 Running scriptlet: urw-base35-bookman-fonts-20200910-16.fc38.noarch 154/590 Installing : urw-base35-c059-fonts-20200910-16.fc38.noarch 155/590 Running scriptlet: urw-base35-c059-fonts-20200910-16.fc38.noarch 155/590 Installing : urw-base35-d050000l-fonts-20200910-16.fc38.noarc 156/590 Running scriptlet: urw-base35-d050000l-fonts-20200910-16.fc38.noarc 156/590 Installing : urw-base35-gothic-fonts-20200910-16.fc38.noarch 157/590 Running scriptlet: urw-base35-gothic-fonts-20200910-16.fc38.noarch 157/590 Installing : urw-base35-nimbus-mono-ps-fonts-20200910-16.fc38 158/590 Running scriptlet: urw-base35-nimbus-mono-ps-fonts-20200910-16.fc38 158/590 Installing : urw-base35-nimbus-roman-fonts-20200910-16.fc38.n 159/590 Running scriptlet: urw-base35-nimbus-roman-fonts-20200910-16.fc38.n 159/590 Installing : urw-base35-nimbus-sans-fonts-20200910-16.fc38.no 160/590 Running scriptlet: urw-base35-nimbus-sans-fonts-20200910-16.fc38.no 160/590 Installing : urw-base35-p052-fonts-20200910-16.fc38.noarch 161/590 Running scriptlet: urw-base35-p052-fonts-20200910-16.fc38.noarch 161/590 Installing : urw-base35-standard-symbols-ps-fonts-20200910-16 162/590 Running scriptlet: urw-base35-standard-symbols-ps-fonts-20200910-16 162/590 Installing : urw-base35-z003-fonts-20200910-16.fc38.noarch 163/590 Running scriptlet: urw-base35-z003-fonts-20200910-16.fc38.noarch 163/590 Installing : urw-base35-fonts-20200910-16.fc38.noarch 164/590 Installing : gflags-devel-2.2.2-11.fc38.x86_64 165/590 Installing : glog-0.3.5-17.fc38.x86_64 166/590 Installing : ceres-solver-2.1.0-5.fc38.x86_64 167/590 Installing : cuda-gcc-12-12.3.1-1.fc38.x86_64 168/590 Installing : cpp-13.2.1-7.fc38.x86_64 169/590 Installing : tensorpipe-0-20220513.1.gitbb1473a4.fc37.x86_64 170/590 Installing : libuv-static-1:1.48.0-1.fc38.x86_64 171/590 Installing : libuv-devel-1:1.48.0-1.fc38.x86_64 172/590 
Installing : nss-softokn-freebl-3.99.0-1.fc38.x86_64 173/590 Installing : nss-softokn-3.99.0-1.fc38.x86_64 174/590 Installing : mesa-libGLU-9.0.3-1.fc38.x86_64 175/590 Installing : leveldb-1.23-6.fc38.x86_64 176/590 Installing : blosc-1.21.5-2.fc38.x86_64 177/590 Installing : netcdf-4.9.0-5.fc38.x86_64 178/590 Installing : libwayland-cursor-1.22.0-1.fc38.x86_64 179/590 Installing : libcusolver-12-3-11.5.4.101-2.x86_64 180/590 Running scriptlet: libcusolver-12-3-11.5.4.101-2.x86_64 180/590 Installing : libnpp-12-3-12.2.3.2-2.x86_64 181/590 Running scriptlet: libnpp-12-3-12.2.3.2-2.x86_64 181/590 Installing : libnvjitlink-12-3-12.3.101-1.x86_64 182/590 Running scriptlet: libnvjitlink-12-3-12.3.101-1.x86_64 182/590 Installing : openblas-openmp64_-0.3.21-4.fc38.x86_64 183/590 Installing : openblas-serial-0.3.21-4.fc38.x86_64 184/590 Installing : openblas-serial64-0.3.21-4.fc38.x86_64 185/590 Installing : openblas-serial64_-0.3.21-4.fc38.x86_64 186/590 Installing : openblas-threads-0.3.21-4.fc38.x86_64 187/590 Installing : openblas-threads64-0.3.21-4.fc38.x86_64 188/590 Installing : openblas-threads64_-0.3.21-4.fc38.x86_64 189/590 Installing : ogdi-4.1.0-10.fc38.x86_64 190/590 Installing : libharu-2.4.3-2.fc38.x86_64 191/590 Installing : zvbi-0.2.35-19.fc38.x86_64 192/590 Running scriptlet: zvbi-0.2.35-19.fc38.x86_64 192/590 Installing : uriparser-0.9.7-2.fc38.x86_64 193/590 Installing : libkml-1.3.0-43.fc38.x86_64 194/590 Installing : zimg-3.0.5-1.fc38.x86_64 195/590 Installing : xerces-c-3.2.5-1.fc38.x86_64 196/590 Installing : xapian-core-libs-1.4.23-1.fc38.x86_64 197/590 Installing : vim-filesystem-2:9.1.264-1.fc38.noarch 198/590 Installing : tzdata-java-2024a-1.fc38.noarch 199/590 Installing : qt-settings-38.3-1.fc38.noarch 200/590 Installing : python-pip-wheel-22.3.1-3.fc38.noarch 201/590 Installing : procps-ng-3.3.17-11.fc38.x86_64 202/590 Installing : openssh-9.0p1-19.fc38.x86_64 203/590 Installing : opencl-headers-3.0-18.20231003git9ce9a72.fc38.no 204/590 
Installing : ncurses-6.4-7.20230520.fc38.1.x86_64 205/590 Installing : minizip-ng-3.0.7-4.fc38.x86_64 206/590 Installing : mesa-filesystem-23.1.9-1.fc38.x86_64 207/590 Installing : mbedtls-2.28.7-1.fc38.x86_64 208/590 Installing : mariadb-connector-c-config-3.3.8-1.fc38.noarch 209/590 Installing : mariadb-connector-c-3.3.8-1.fc38.x86_64 210/590 Installing : libwayland-egl-1.22.0-1.fc38.x86_64 211/590 Installing : libwacom-data-2.8.0-1.fc38.noarch 212/590 Installing : libvpx-1.13.1-1.fc38.x86_64 213/590 Installing : libusb1-1.0.27-1.fc38.x86_64 214/590 Installing : liburing-2.4-2.fc38.x86_64 215/590 Installing : rocksdb-7.8.3-1.fc38.x86_64 216/590 Installing : libstdc++-devel-13.2.1-7.fc38.x86_64 217/590 Installing : libpq-15.3-1.fc38.x86_64 218/590 Installing : libkadm5-1.21-3.fc38.x86_64 219/590 Installing : libgpg-error-1.47-1.fc38.x86_64 220/590 Installing : libgcrypt-1.10.2-1.fc38.x86_64 221/590 Installing : libevdev-1.13.1-1.fc38.x86_64 222/590 Installing : libX11-common-1.8.7-1.fc38.noarch 223/590 Installing : libX11-1.8.7-1.fc38.x86_64 224/590 Installing : libXext-1.3.5-2.fc38.x86_64 225/590 Installing : libXrender-0.9.11-2.fc38.x86_64 226/590 Installing : libXfixes-6.0.0-5.fc38.x86_64 227/590 Installing : libXcursor-1.2.1-3.fc38.x86_64 228/590 Installing : libXv-1.0.11-18.fc38.x86_64 229/590 Installing : libXxf86vm-1.1.5-2.fc38.x86_64 230/590 Installing : libvdpau-1.5-3.fc38.x86_64 231/590 Installing : libXi-1.8.1-1.fc38.x86_64 232/590 Installing : libXt-1.2.1-4.fc38.x86_64 233/590 Installing : libX11-devel-1.8.7-1.fc38.x86_64 234/590 Installing : libXpm-3.5.17-1.fc38.x86_64 235/590 Installing : less-633-1.fc38.x86_64 236/590 Installing : keyutils-libs-devel-1.6.3-1.fc38.x86_64 237/590 Installing : kernel-headers-6.8.3-100.fc38.x86_64 238/590 Installing : json-c-0.17-1.fc38.x86_64 239/590 Installing : hwdata-0.380-1.fc38.noarch 240/590 Installing : libpciaccess-0.16-8.fc38.x86_64 241/590 Installing : libdrm-2.4.120-1.fc38.x86_64 242/590 Installing : 
mesa-libgbm-23.1.9-1.fc38.x86_64 243/590 Installing : libglvnd-egl-1:1.6.0-2.fc38.x86_64 244/590 Installing : mesa-libEGL-23.1.9-1.fc38.x86_64 245/590 Installing : libvpl-1:2.10.2-1.fc38.x86_64 246/590 Installing : libglvnd-gles-1:1.6.0-2.fc38.x86_64 247/590 Installing : libglvnd-glx-1:1.6.0-2.fc38.x86_64 248/590 Installing : mesa-libGL-23.1.9-1.fc38.x86_64 249/590 Installing : libva-2.18.0-1.fc38.x86_64 250/590 Installing : libavutil-free-6.0.1-2.fc38.x86_64 251/590 Installing : libswscale-free-6.0.1-2.fc38.x86_64 252/590 Installing : glx-utils-8.5.0-1.fc38.x86_64 253/590 Installing : libGLEW-2.2.0-4.fc38.x86_64 254/590 Installing : highway-1.1.0-1.fc38.x86_64 255/590 Installing : libjxl-1:0.7.0-6.fc38.x86_64 256/590 Installing : google-noto-fonts-common-20230201-2.fc38.noarch 257/590 Installing : google-noto-sans-vf-fonts-20230201-2.fc38.noarch 258/590 Installing : google-droid-sans-fonts-20200215-15.fc38.noarch 259/590 Installing : langpacks-core-font-en-3.0-32.fc38.noarch 260/590 Installing : glibc-headers-x86-2.37-18.fc38.noarch 261/590 Installing : libxcrypt-devel-4.4.36-1.fc38.x86_64 262/590 Installing : glibc-devel-2.37-18.fc38.x86_64 263/590 Installing : giflib-5.2.2-1.fc38.x86_64 264/590 Installing : emacs-filesystem-1:29.3-1.fc38.noarch 265/590 Installing : dbus-common-1:1.14.10-1.fc38.noarch 266/590 Running scriptlet: dbus-common-1:1.14.10-1.fc38.noarch 266/590 Running scriptlet: dbus-broker-33-1.fc38.x86_64 267/590 Installing : dbus-broker-33-1.fc38.x86_64 267/590 Running scriptlet: dbus-broker-33-1.fc38.x86_64 267/590 Installing : dbus-1:1.14.10-1.fc38.x86_64 268/590 Installing : clang15-resource-filesystem-15.0.7-5.fc38.x86_64 269/590 Installing : clang15-libs-15.0.7-5.fc38.x86_64 270/590 Installing : annobin-docs-12.40-1.fc38.noarch 271/590 Installing : zlib-devel-1.2.13-3.fc38.x86_64 272/590 Installing : xvidcore-1.3.7-9.fc38.x86_64 273/590 Installing : xkeyboard-config-2.38-1.fc38.noarch 274/590 Installing : libxkbcommon-1.5.0-2.fc38.x86_64 
275/590 Installing : libxkbcommon-x11-1.5.0-2.fc38.x86_64 276/590 Installing : vo-amrwbenc-0.1.3-18.fc38.x86_64 277/590 Installing : twolame-libs-0.4.0-2.fc38.x86_64 278/590 Installing : tcl-1:8.6.12-4.fc38.x86_64 279/590 Installing : speex-1.2.0-13.fc38.x86_64 280/590 Installing : soxr-0.1.3-13.fc38.x86_64 281/590 Installing : libswresample-free-6.0.1-2.fc38.x86_64 282/590 Installing : scotch-6.1.2-3.fc37.x86_64 283/590 Installing : rhash-1.4.3-2.fc38.x86_64 284/590 Installing : python-setuptools-wheel-65.5.1-2.fc38.noarch 285/590 Installing : pugixml-1.13-2.fc38.x86_64 286/590 Installing : protobuf-3.19.6-2.fc38.x86_64 287/590 Installing : proj-data-9.1.1-1.fc38.noarch 288/590 Installing : poppler-data-0.4.11-4.fc38.noarch 289/590 Installing : pixman-0.42.2-1.fc38.x86_64 290/590 Installing : pcre2-utf32-10.42-1.fc38.1.x86_64 291/590 Installing : pcre2-devel-10.42-1.fc38.1.x86_64 292/590 Installing : pcre-8.45-1.fc38.3.x86_64 293/590 Installing : gklib-5.1.1-20230326.0.git8bd6bad7.fc38.x86_64 294/590 Installing : metis-5.2.1-20230403.0.gite0f1b88b.fc38.x86_64 295/590 Installing : orc-0.4.33-2.fc38.x86_64 296/590 Installing : opencore-amr-0.1.6-3.fc38.x86_64 297/590 Installing : numactl-libs-2.0.16-2.fc38.x86_64 298/590 Installing : netpbm-11.02.00-1.fc38.x86_64 299/590 Installing : gts-0.7.6-44.20121130.fc38.x86_64 300/590 Installing : mtdev-1.1.6-5.fc38.x86_64 301/590 Installing : mpg123-libs-1.31.3-1.fc38.x86_64 302/590 Installing : libopenmpt-0.6.12-1.fc38.x86_64 303/590 Installing : mpdecimal-2.5.1-6.fc38.x86_64 304/590 Installing : miniz-3.0.2-2.fc38.x86_64 305/590 Installing : lua-lpeg-1.0.2-10.fc38.x86_64 306/590 Installing : lua-json-1.3.4-3.fc38.noarch 307/590 Installing : lua-filesystem-1.8.0-8.fc38.x86_64 308/590 Installing : Lmod-8.7.32-1.fc38.x86_64 309/590 Running scriptlet: Lmod-8.7.32-1.fc38.x86_64 309/590 Installing : lksctp-tools-1.0.19-3.fc38.x86_64 310/590 Installing : libyaml-0.2.5-9.fc38.x86_64 311/590 Installing : libvmaf-2.3.0-5.fc38.x86_64 
312/590 Installing : libaom-3.8.2-1.fc38.x86_64 313/590 Installing : libavif-0.11.1-7.fc38.x86_64 314/590 Installing : libvisual-1:0.4.1-1.fc38.x86_64 315/590 Installing : libverto-devel-0.3.2-5.fc38.x86_64 316/590 Installing : libudfread-1.1.2-5.fc38.x86_64 317/590 Installing : libsepol-devel-3.5-1.fc38.x86_64 318/590 Installing : libselinux-devel-3.5-1.fc38.x86_64 319/590 Installing : libseccomp-2.5.3-4.fc38.x86_64 320/590 Installing : libraw1394-2.1.2-17.fc38.x86_64 321/590 Installing : libdc1394-2.2.6-9.fc38.x86_64 322/590 Installing : librabbitmq-0.13.0-1.fc38.x86_64 323/590 Installing : libqhull_r-1:7.2.1-12.fc38.x86_64 324/590 Installing : libproxy-0.4.18-6.fc38.x86_64 325/590 Installing : qt5-qtbase-common-5.15.12-5.fc38.noarch 326/590 Running scriptlet: qt5-qtbase-5.15.12-5.fc38.x86_64 327/590 Installing : qt5-qtbase-5.15.12-5.fc38.x86_64 327/590 Running scriptlet: qt5-qtbase-5.15.12-5.fc38.x86_64 327/590 Installing : libpaper-1:2.0.8-1.fc38.x86_64 328/590 Installing : libmodplug-1:0.8.9.0-16.fc38.x86_64 329/590 Installing : libimagequant-2.17.0-4.fc38.x86_64 330/590 Installing : libijs-0.35-17.fc38.x86_64 331/590 Installing : libgta-1.2.1-9.fc38.x86_64 332/590 Installing : libglvnd-core-devel-1:1.6.0-2.fc38.x86_64 333/590 Installing : libglvnd-devel-1:1.6.0-2.fc38.x86_64 334/590 Installing : libfontenc-1.1.6-2.fc38.x86_64 335/590 Installing : libdatrie-0.2.13-5.fc38.x86_64 336/590 Installing : libthai-0.1.29-4.fc38.x86_64 337/590 Installing : libcom_err-devel-1.46.5-4.fc38.x86_64 338/590 Installing : krb5-devel-1.21-3.fc38.x86_64 339/590 Installing : libcbor-0.7.0-9.fc38.x86_64 340/590 Installing : libfido2-1.12.0-3.fc38.x86_64 341/590 Installing : openssh-clients-9.0p1-19.fc38.x86_64 342/590 Running scriptlet: openssh-clients-9.0p1-19.fc38.x86_64 342/590 Installing : git-core-2.44.0-1.fc38.x86_64 343/590 Installing : git-core-doc-2.44.0-1.fc38.noarch 344/590 Installing : libb2-0.98.1-8.fc38.x86_64 345/590 Installing : python3-3.11.8-2.fc38.x86_64 346/590 
Installing : python3-libs-3.11.8-2.fc38.x86_64 347/590 Installing : gstreamer1-1.22.9-1.fc38.x86_64 348/590 Installing : cmake-rpm-macros-3.27.7-1.fc38.noarch 349/590 Installing : vapoursynth-libs-58-4.fc38.x86_64 350/590 Installing : onnx-optimizer-0.3.19-20240303.0.gitb3a46118.fc3 351/590 Installing : crypto-policies-scripts-20230301-1.gita12f7b2.fc 352/590 Installing : nss-sysinit-3.99.0-1.fc38.x86_64 353/590 Installing : nss-3.99.0-1.fc38.x86_64 354/590 Running scriptlet: nss-3.99.0-1.fc38.x86_64 354/590 Installing : java-17-openjdk-headless-1:17.0.9.0.9-3.fc38.x86 355/590 Running scriptlet: java-17-openjdk-headless-1:17.0.9.0.9-3.fc38.x86 355/590 Installing : byte-buddy-agent-1.12.10-3.fc38.noarch 356/590 Installing : javapackages-tools-6.1.0-7.fc38.noarch 357/590 Installing : objectweb-asm-9.3-5.fc38.noarch 358/590 Installing : byte-buddy-1.12.10-3.fc38.noarch 359/590 Installing : objenesis-3.3-2.fc38.noarch 360/590 Installing : opentest4j-1.2.0-12.fc38.noarch 361/590 Installing : mockito-3.12.4-6.fc38.noarch 362/590 Installing : jacop-4.9.0-1.fc38.noarch 363/590 Installing : python3-packaging-23.0-1.fc38.noarch 364/590 Installing : python3-rpm-generators-14-4.fc38.noarch 365/590 Installing : python3-six-1.16.0-9.fc38.noarch 366/590 Installing : libwacom-2.8.0-1.fc38.x86_64 367/590 Installing : libinput-1.23.0-2.fc38.x86_64 368/590 Running scriptlet: libinput-1.23.0-2.fc38.x86_64 368/590 Installing : lame-libs-3.100-14.fc38.x86_64 369/590 Installing : kmod-libs-30-4.fc38.x86_64 370/590 Installing : systemd-pam-253.17-1.fc38.x86_64 371/590 Installing : systemd-253.17-1.fc38.x86_64 372/590 Running scriptlet: systemd-253.17-1.fc38.x86_64 372/590 Creating group 'input' with GID 104. Creating group 'kvm' with GID 36. Creating group 'render' with GID 105. Creating group 'sgx' with GID 106. Creating group 'systemd-journal' with GID 190. Creating group 'systemd-oom' with GID 999. Creating user 'systemd-oom' (systemd Userspace OOM Killer) with UID 999 and GID 999. 
Running scriptlet: samba-common-2:4.18.11-1.fc38.noarch 373/590 Installing : samba-common-2:4.18.11-1.fc38.noarch 373/590 Running scriptlet: samba-common-2:4.18.11-1.fc38.noarch 373/590 Running scriptlet: libwbclient-2:4.18.11-1.fc38.x86_64 374/590 Installing : libwbclient-2:4.18.11-1.fc38.x86_64 374/590 Installing : samba-client-libs-2:4.18.11-1.fc38.x86_64 375/590 Installing : samba-common-libs-2:4.18.11-1.fc38.x86_64 376/590 Installing : libsmbclient-2:4.18.11-1.fc38.x86_64 377/590 Installing : jbigkit-libs-2.1-25.fc38.x86_64 378/590 Installing : libtiff-4.4.0-8.fc38.x86_64 379/590 Installing : proj-9.1.1-1.fc38.x86_64 380/590 Installing : libgeotiff-1.7.1-6.fc38.x86_64 381/590 Installing : libspatialite-5.0.1-20.fc38.x86_64 382/590 Installing : gdk-pixbuf2-modules-2.42.10-2.fc38.x86_64 383/590 Installing : jbig2dec-libs-0.19-8.fc38.x86_64 384/590 Installing : isl-0.16.1-17.fc38.x86_64 385/590 Installing : ilbc-3.0.4-4.fc38.x86_64 386/590 Installing : gsm-1.0.22-2.fc38.x86_64 387/590 Installing : gsl-2.7.1-4.fc38.x86_64 388/590 Running scriptlet: groff-base-1.22.4-11.fc38.x86_64 389/590 Installing : groff-base-1.22.4-11.fc38.x86_64 389/590 Running scriptlet: groff-base-1.22.4-11.fc38.x86_64 389/590 Installing : perl-Digest-1.20-490.fc38.noarch 390/590 Installing : perl-Digest-MD5-2.58-490.fc38.x86_64 391/590 Installing : perl-B-1.83-498.fc38.x86_64 392/590 Installing : perl-FileHandle-2.03-498.fc38.noarch 393/590 Installing : perl-Data-Dumper-2.184-491.fc38.x86_64 394/590 Installing : perl-libnet-3.15-1.fc38.noarch 395/590 Installing : perl-AutoLoader-5.74-498.fc38.noarch 396/590 Installing : perl-base-2.27-498.fc38.noarch 397/590 Installing : perl-URI-5.17-2.fc38.noarch 398/590 Installing : perl-Time-Local-2:1.300-490.fc38.noarch 399/590 Installing : perl-Mozilla-CA-20221114-2.fc38.noarch 400/590 Installing : perl-Text-Tabs+Wrap-2023.0511-1.fc38.noarch 401/590 Installing : perl-if-0.61.000-498.fc38.noarch 402/590 Installing : perl-locale-1.10-498.fc38.noarch 
403/590 Installing : perl-IO-Socket-IP-0.41-492.fc38.noarch 404/590 Installing : perl-File-Path-2.18-490.fc38.noarch 405/590 Installing : perl-IO-Socket-SSL-2.081-1.fc38.noarch 406/590 Installing : perl-Net-SSLeay-1.92-5.fc38.x86_64 407/590 Installing : perl-Pod-Escapes-1:1.07-490.fc38.noarch 408/590 Installing : perl-Term-ANSIColor-5.01-491.fc38.noarch 409/590 Installing : perl-Class-Struct-0.66-498.fc38.noarch 410/590 Installing : perl-POSIX-2.03-498.fc38.x86_64 411/590 Installing : perl-IPC-Open3-1.22-498.fc38.noarch 412/590 Installing : perl-File-Temp-1:0.231.100-490.fc38.noarch 413/590 Installing : perl-HTTP-Tiny-0.086-2.fc38.noarch 414/590 Installing : perl-Term-Cap-1.18-1.fc38.noarch 415/590 Installing : perl-Pod-Simple-1:3.43-491.fc38.noarch 416/590 Installing : perl-Socket-4:2.036-2.fc38.x86_64 417/590 Installing : perl-SelectSaver-1.02-498.fc38.noarch 418/590 Installing : perl-Symbol-1.09-498.fc38.noarch 419/590 Installing : perl-File-stat-1.12-498.fc38.noarch 420/590 Installing : perl-podlators-1:5.01-2.fc38.noarch 421/590 Installing : perl-Pod-Perldoc-3.28.01-491.fc38.noarch 422/590 Installing : perl-Text-ParseWords-3.31-490.fc38.noarch 423/590 Installing : perl-Fcntl-1.15-498.fc38.x86_64 424/590 Installing : perl-mro-1.26-498.fc38.x86_64 425/590 Installing : perl-IO-1.50-498.fc38.x86_64 426/590 Installing : perl-overloading-0.02-498.fc38.noarch 427/590 Installing : perl-Pod-Usage-4:2.03-4.fc38.noarch 428/590 Installing : perl-MIME-Base64-3.16-490.fc38.x86_64 429/590 Installing : perl-Scalar-List-Utils-5:1.63-490.fc38.x86_64 430/590 Installing : perl-constant-1.33-491.fc38.noarch 431/590 Installing : perl-parent-1:0.241-1.fc38.noarch 432/590 Installing : perl-Errno-1.36-498.fc38.x86_64 433/590 Installing : perl-File-Basename-2.85-498.fc38.noarch 434/590 Installing : perl-Getopt-Std-1.13-498.fc38.noarch 435/590 Installing : perl-Storable-1:3.26-490.fc38.x86_64 436/590 Installing : perl-overload-1.35-498.fc38.noarch 437/590 Installing : 
perl-vars-1.05-498.fc38.noarch 438/590 Installing : perl-Getopt-Long-1:2.54-2.fc38.noarch 439/590 Installing : perl-Carp-1.52-490.fc38.noarch 440/590 Installing : perl-Exporter-5.77-490.fc38.noarch 441/590 Installing : perl-PathTools-3.84-490.fc38.x86_64 442/590 Installing : perl-DynaLoader-1.52-498.fc38.x86_64 443/590 Installing : perl-Encode-4:3.19-493.fc38.x86_64 444/590 Installing : perl-libs-4:5.36.3-498.fc38.x86_64 445/590 Installing : perl-interpreter-4:5.36.3-498.fc38.x86_64 446/590 Installing : infiniband-diags-44.0-3.fc38.x86_64 447/590 Installing : perl-Error-1:0.17029-11.fc38.noarch 448/590 Installing : perl-TermReadKey-2.38-16.fc38.x86_64 449/590 Installing : perl-File-Find-1.40-498.fc38.noarch 450/590 Installing : perl-lib-0.65-498.fc38.x86_64 451/590 Installing : perl-Git-2.44.0-1.fc38.noarch 452/590 Installing : git-2.44.0-1.fc38.x86_64 453/590 Installing : graphite2-1.3.14-11.fc38.x86_64 454/590 Installing : cairo-1.17.8-4.fc38.x86_64 455/590 Installing : harfbuzz-7.1.0-1.fc38.x86_64 456/590 Installing : freetype-2.13.0-2.fc38.x86_64 457/590 Installing : fontconfig-2.14.2-2.fc38.x86_64 458/590 Running scriptlet: fontconfig-2.14.2-2.fc38.x86_64 458/590 Installing : qt5-qtbase-gui-5.15.12-5.fc38.x86_64 459/590 Installing : poppler-23.02.0-3.fc38.x86_64 460/590 Installing : poppler-glib-23.02.0-3.fc38.x86_64 461/590 Installing : gecode-6.2.0-11.fc38.x86_64 462/590 Installing : mp-3.1.0-41.20200303git7fd4828.fc38.x86_64 463/590 Installing : gd-2.3.3-10.fc38.x86_64 464/590 Installing : libbluray-1.3.4-2.fc38.x86_64 465/590 Installing : libXft-2.3.8-2.fc38.x86_64 466/590 Installing : mkfontscale-1.2.2-3.fc38.x86_64 467/590 Installing : xorg-x11-fonts-ISO8859-1-100dpi-7.5-35.fc38.noar 468/590 Running scriptlet: xorg-x11-fonts-ISO8859-1-100dpi-7.5-35.fc38.noar 468/590 Installing : openslide-3.4.1-23.fc38.x86_64 469/590 Installing : cairo-gobject-1.17.8-4.fc38.x86_64 470/590 Installing : gmp-c++-1:6.2.1-4.fc38.x86_64 471/590 Installing : 
gmp-devel-1:6.2.1-4.fc38.x86_64 472/590 Installing : gl-manpages-1.1-26.20190306.fc38.noarch 473/590 Installing : gc-8.2.2-3.fc38.x86_64 474/590 Installing : guile22-2.2.7-7.fc38.x86_64 475/590 Installing : make-1:4.4.1-1.fc38.x86_64 476/590 Installing : gcc-13.2.1-7.fc38.x86_64 477/590 Running scriptlet: gcc-13.2.1-7.fc38.x86_64 477/590 Installing : cmake-data-3.27.7-1.fc38.noarch 478/590 Installing : cmake-3.27.7-1.fc38.x86_64 479/590 Installing : pybind11-devel-2.10.3-2.fc38.x86_64 480/590 Installing : gcc-c++-13.2.1-7.fc38.x86_64 481/590 Installing : game-music-emu-0.6.3-11.fc38.x86_64 482/590 Installing : fribidi-1.0.12-3.fc38.x86_64 483/590 Installing : pango-1.50.14-1.fc38.x86_64 484/590 Installing : librsvg2-2.56.4-1.fc38.x86_64 485/590 Installing : lasi-1.1.3-10.fc38.x86_64 486/590 Installing : fdk-aac-free-2.0.0-10.fc38.x86_64 487/590 Installing : libavcodec-free-6.0.1-2.fc38.x86_64 488/590 Installing : libchromaprint-1.5.1-8.fc38.x86_64 489/590 Installing : cliquer-libs-1.22-5.fc38.x86_64 490/590 Installing : libnauty-2.8.6-5.fc38.x86_64 491/590 Installing : cjson-1.7.14-7.fc38.x86_64 492/590 Running scriptlet: cjson-1.7.14-7.fc38.x86_64 492/590 Installing : librist-0.2.7-1.fc38.x86_64 493/590 Installing : libavformat-free-6.0.1-2.fc38.x86_64 494/590 Installing : cfitsio-4.2.0-3.fc38.x86_64 495/590 Installing : gdal-libs-3.6.4-2.fc38.x86_64 496/590 Installing : vtk-9.2.5-2.fc38.x86_64 497/590 Installing : cdparanoia-libs-10.2-41.fc38.x86_64 498/590 Installing : gstreamer1-plugins-base-1.22.9-1.fc38.x86_64 499/590 Installing : adobe-mappings-pdf-20190401-3.fc38.noarch 500/590 Installing : libgs-10.02.1-2.fc38.x86_64 501/590 Installing : graphviz-7.1.0-3.fc38.x86_64 502/590 Running scriptlet: graphviz-7.1.0-3.fc38.x86_64 502/590 Installing : MUMPS-common-5.5.1-1.fc38.noarch 503/590 Installing : MUMPS-5.5.1-1.fc38.x86_64 504/590 Installing : coin-or-Cbc-2.10.5-12.fc38.x86_64 505/590 Installing : coin-or-Clp-1.17.6-12.fc38.x86_64 506/590 Installing : 
coin-or-Cgl-0.60.3-9.fc38.x86_64 507/590 Installing : opencv-contrib-4.9.0-20231227.1.cu12_3.fc38.x86_ 508/590 Installing : opencv-core-4.9.0-20231227.1.cu12_3.fc38.x86_64 509/590 Installing : opencv-cuda-4.9.0-20231227.1.cu12_3.fc38.x86_64 510/590 Installing : opencv-4.9.0-20231227.1.cu12_3.fc38.x86_64 511/590 Installing : opencv-static-4.9.0-20231227.1.cu12_3.fc38.x86_6 512/590 Installing : opencv-devel-4.9.0-20231227.1.cu12_3.fc38.x86_64 513/590 Installing : cuda-nvvm-12-3-12.3.107-1.x86_64 514/590 Installing : cuda-nvtx-12-3-12.3.101-1.x86_64 515/590 Installing : cuda-driver-devel-12-3-12.3.101-1.x86_64 516/590 Installing : cuda-cupti-12-3-12.3.101-1.x86_64 517/590 Installing : kineto-0.4.0-20240327.0.git445909a8.cu12_3.fc38. 518/590 Installing : cuda-crt-12-3-12.3.107-1.x86_64 519/590 Installing : cuda-nvcc-12-3-12.3.107-1.x86_64 520/590 Installing : cutlass-3.4.1-20240215.0.cu12_3.fc38.x86_64 521/590 Installing : cuda-cccl-12-3-12.3.101-1.x86_64 522/590 Installing : sleef-3.6-20240320.0.git60e76d2b.fc38.x86_64 523/590 Installing : fp16-1:0-20240410.0.git581ac1c7.fc38.x86_64 524/590 Installing : foxi-0-20210526.1.gitc278588e.fc37.x86_64 525/590 Installing : foxi-devel-0-20210526.1.gitc278588e.fc37.x86_64 526/590 Installing : fp16-devel-1:0-20240410.0.git581ac1c7.fc38.x86_6 527/590 Installing : sleef-devel-3.6-20240320.0.git60e76d2b.fc38.x86_ 528/590 Installing : cuda-cudart-devel-12-3-12.3.101-1.x86_64 529/590 Installing : cutlass-devel-3.4.1-20240215.0.cu12_3.fc38.x86_6 530/590 Installing : kineto-devel-0.4.0-20240327.0.git445909a8.cu12_3 531/590 Installing : doxygen-2:1.9.6-7.fc38.x86_64 532/590 Installing : python3-pybind11-2.10.3-2.fc38.x86_64 533/590 Installing : annobin-plugin-gcc-12.40-1.fc38.x86_64 534/590 Running scriptlet: annobin-plugin-gcc-12.40-1.fc38.x86_64 534/590 Installing : gcc-plugin-annobin-13.2.1-7.fc38.x86_64 535/590 Running scriptlet: gcc-plugin-annobin-13.2.1-7.fc38.x86_64 535/590 Installing : mesa-libGLU-devel-9.0.3-1.fc38.x86_64 
536/590 Installing : mpfr-devel-4.1.1-3.fc38.x86_64 537/590 Installing : rdma-core-devel-44.0-3.fc38.x86_64 538/590 Installing : cuda-gcc-12-c++-12.3.1-1.fc38.x86_64 539/590 Installing : peachpy-python3-0-20221113.1.git349e8f83.fc38.no 540/590 Installing : python3-devel-3.11.8-2.fc38.x86_64 541/590 Installing : onnx-optimizer-devel-0.3.19-20240303.0.gitb3a461 542/590 Installing : python3-pyyaml-6.0-6.fc38.x86_64 543/590 Installing : python3-setuptools-65.5.1-2.fc38.noarch 544/590 Installing : python3-typing-extensions-4.5.0-1.fc38.noarch 545/590 Installing : python3-numpy-1:1.24.4-1.fc38.x86_64 546/590 Installing : zeromq-devel-4.3.4-5.fc38.x86_64 547/590 Installing : miniz-devel-3.0.2-2.fc38.x86_64 548/590 Installing : numactl-devel-2.0.16-2.fc38.x86_64 549/590 Installing : protobuf-compat-devel-3.21.9-2.fc38.x86_64 550/590 Installing : rocksdb-devel-7.8.3-1.fc38.x86_64 551/590 Installing : ocl-icd-devel-2.3.2-1.fc38.x86_64 552/590 Installing : openblas-devel-0.3.21-4.fc38.x86_64 553/590 Installing : libnvjitlink-devel-12-3-12.3.101-1.x86_64 554/590 Installing : libcusolver-devel-12-3-11.5.4.101-2.x86_64 555/590 Installing : leveldb-devel-1.23-6.fc38.x86_64 556/590 Installing : tensorpipe-devel-0-20220513.1.gitbb1473a4.fc37.x 557/590 Installing : glog-devel-0.3.5-17.fc38.x86_64 558/590 Installing : lmdb-devel-0.9.32-1.fc38.x86_64 559/590 Installing : qnnpack-devel-0-20190828.2.git7d2a4e99.fc38.x86_ 560/590 Installing : nnpack-devel-0-20230201.0.git70a77f48.fc38.x86_6 561/590 Installing : magma-devel-2.8.0-20240328.0.cu12_3.fc38.x86_64 562/590 Installing : fftw-devel-3.3.10-10.fc38.x86_64 563/590 Installing : gloo-devel-1:0.5.0-20240302.0.git2565674c.cu12_3 564/590 Installing : fbgemm-devel-0.7.0-20240315.0.git0049a2ca.fc38.x 565/590 Installing : asmjit-devel-1:0-20220702.1.gitc5984762.fc38.x86 566/590 Installing : libnccl-devel-2.21.5-1+cuda12.4.x86_64 567/590 Running scriptlet: libnccl-devel-2.21.5-1+cuda12.4.x86_64 567/590 Installing : 
flatbuffers-compiler-23.3.3-1.fc38.x86_64 568/590 Installing : flatbuffers-devel-23.3.3-1.fc38.x86_64 569/590 Installing : hiredis-devel-1.0.2-4.fc38.x86_64 570/590 Installing : tbb-devel-2020.3-16.fc38.x86_64 571/590 Installing : libcurand-devel-12-3-10.3.4.107-1.x86_64 572/590 Installing : libcusparse-devel-12-3-12.2.0.103-2.x86_64 573/590 Installing : libcufft-devel-12-3-11.0.12.1-2.x86_64 574/590 Installing : onnx-devel-1.17.0-20240404.0.git4128a090.fc38.x8 575/590 Installing : pthreadpool-devel-1:0.1-20240121.0.git178e3e06.f 576/590 Installing : libcudnn8-devel-8.9.7.29-2.cuda12.3.x86_64 577/590 Running scriptlet: libcudnn8-devel-8.9.7.29-2.cuda12.3.x86_64 577/590 Installing : cuda-nvrtc-devel-12-3-12.3.107-1.x86_64 578/590 Installing : cpuinfo-devel-1:0-20240327.0.gitf42f5eaf.fc38.x8 579/590 Installing : snappy-devel-1.1.9-7.fc38.x86_64 580/590 Installing : libcublas-devel-12-3-12.3.4.1-2.x86_64 581/590 Installing : neon2sse-devel-0-20230131.0.git097a5eca.fc38.noa 582/590 Installing : eigen3-devel-3.4.0-9.fc38.noarch 583/590 Installing : systemd-rpm-macros-253.17-1.fc38.noarch 584/590 Installing : libzstd-devel-1.5.5-1.fc38.x86_64 585/590 Installing : cuda-profiler-api-12-3-12.3.101-1.x86_64 586/590 Installing : cuda-nvml-devel-12-3-12.3.101-1.x86_64 587/590 Installing : psimd-devel-1:0-20200517.2.git072586a7.fc38.noar 588/590 Installing : gemmlowp-devel-0-20231104.0.git16e8662c.fc38.noa 589/590 Installing : fxdiv-devel-1:0-20201208.1.git63058eff.fc38.noar 590/590 Running scriptlet: cuda-toolkit-12-3-config-common-12.3.101-1.noarc 590/590 Running scriptlet: copy-jdk-configs-4.1-2.fc38.noarch 590/590 Running scriptlet: urw-base35-bookman-fonts-20200910-16.fc38.noarch 590/590 Running scriptlet: urw-base35-c059-fonts-20200910-16.fc38.noarch 590/590 Running scriptlet: urw-base35-d050000l-fonts-20200910-16.fc38.noarc 590/590 Running scriptlet: urw-base35-gothic-fonts-20200910-16.fc38.noarch 590/590 Running scriptlet: 
urw-base35-nimbus-mono-ps-fonts-20200910-16.fc38 590/590 Running scriptlet: urw-base35-nimbus-roman-fonts-20200910-16.fc38.n 590/590 Running scriptlet: urw-base35-nimbus-sans-fonts-20200910-16.fc38.no 590/590 Running scriptlet: urw-base35-p052-fonts-20200910-16.fc38.noarch 590/590 Running scriptlet: urw-base35-standard-symbols-ps-fonts-20200910-16 590/590 Running scriptlet: urw-base35-z003-fonts-20200910-16.fc38.noarch 590/590 Running scriptlet: crypto-policies-scripts-20230301-1.gita12f7b2.fc 590/590 Running scriptlet: nss-3.99.0-1.fc38.x86_64 590/590 Running scriptlet: java-17-openjdk-headless-1:17.0.9.0.9-3.fc38.x86 590/590 Running scriptlet: fontconfig-2.14.2-2.fc38.x86_64 590/590 Running scriptlet: fxdiv-devel-1:0-20201208.1.git63058eff.fc38.noar 590/590 Verifying : asmjit-1:0-20220702.1.gitc5984762.fc38.x86_64 1/590 Verifying : asmjit-devel-1:0-20220702.1.gitc5984762.fc38.x86 2/590 Verifying : cpuinfo-1:0-20240327.0.gitf42f5eaf.fc38.x86_64 3/590 Verifying : cpuinfo-devel-1:0-20240327.0.gitf42f5eaf.fc38.x8 4/590 Verifying : cuda-gcc-12-12.3.1-1.fc38.x86_64 5/590 Verifying : cuda-gcc-12-c++-12.3.1-1.fc38.x86_64 6/590 Verifying : cutlass-3.4.1-20240215.0.cu12_3.fc38.x86_64 7/590 Verifying : cutlass-devel-3.4.1-20240215.0.cu12_3.fc38.x86_6 8/590 Verifying : fbgemm-0.7.0-20240315.0.git0049a2ca.fc38.x86_64 9/590 Verifying : fbgemm-devel-0.7.0-20240315.0.git0049a2ca.fc38.x 10/590 Verifying : foxi-0-20210526.1.gitc278588e.fc37.x86_64 11/590 Verifying : foxi-devel-0-20210526.1.gitc278588e.fc37.x86_64 12/590 Verifying : fp16-1:0-20240410.0.git581ac1c7.fc38.x86_64 13/590 Verifying : fp16-devel-1:0-20240410.0.git581ac1c7.fc38.x86_6 14/590 Verifying : fxdiv-devel-1:0-20201208.1.git63058eff.fc38.noar 15/590 Verifying : gemmlowp-devel-0-20231104.0.git16e8662c.fc38.noa 16/590 Verifying : gklib-5.1.1-20230326.0.git8bd6bad7.fc38.x86_64 17/590 Verifying : gloo-1:0.5.0-20240302.0.git2565674c.cu12_3.fc38. 
18/590 Verifying : gloo-devel-1:0.5.0-20240302.0.git2565674c.cu12_3 19/590 Verifying : halide-17.0.1-20240220.0.fc38.x86_64 20/590 Verifying : kineto-0.4.0-20240327.0.git445909a8.cu12_3.fc38. 21/590 Verifying : kineto-devel-0.4.0-20240327.0.git445909a8.cu12_3 22/590 Verifying : magma-2.8.0-20240328.0.cu12_3.fc38.x86_64 23/590 Verifying : magma-devel-2.8.0-20240328.0.cu12_3.fc38.x86_64 24/590 Verifying : metis-5.2.1-20230403.0.gite0f1b88b.fc38.x86_64 25/590 Verifying : neon2sse-devel-0-20230131.0.git097a5eca.fc38.noa 26/590 Verifying : nnpack-0-20230201.0.git70a77f48.fc38.x86_64 27/590 Verifying : nnpack-devel-0-20230201.0.git70a77f48.fc38.x86_6 28/590 Verifying : onnx-devel-1.17.0-20240404.0.git4128a090.fc38.x8 29/590 Verifying : onnx-libs-1.17.0-20240404.0.git4128a090.fc38.x86 30/590 Verifying : onnx-optimizer-0.3.19-20240303.0.gitb3a46118.fc3 31/590 Verifying : onnx-optimizer-devel-0.3.19-20240303.0.gitb3a461 32/590 Verifying : opencv-4.9.0-20231227.1.cu12_3.fc38.x86_64 33/590 Verifying : opencv-contrib-4.9.0-20231227.1.cu12_3.fc38.x86_ 34/590 Verifying : opencv-core-4.9.0-20231227.1.cu12_3.fc38.x86_64 35/590 Verifying : opencv-cuda-4.9.0-20231227.1.cu12_3.fc38.x86_64 36/590 Verifying : opencv-devel-4.9.0-20231227.1.cu12_3.fc38.x86_64 37/590 Verifying : opencv-static-4.9.0-20231227.1.cu12_3.fc38.x86_6 38/590 Verifying : peachpy-python3-0-20221113.1.git349e8f83.fc38.no 39/590 Verifying : protobuf-compat-3.21.9-2.fc38.x86_64 40/590 Verifying : protobuf-compat-compiler-3.21.9-2.fc38.x86_64 41/590 Verifying : protobuf-compat-devel-3.21.9-2.fc38.x86_64 42/590 Verifying : psimd-devel-1:0-20200517.2.git072586a7.fc38.noar 43/590 Verifying : pthreadpool-1:0.1-20240121.0.git178e3e06.fc38.x8 44/590 Verifying : pthreadpool-devel-1:0.1-20240121.0.git178e3e06.f 45/590 Verifying : qnnpack-0-20190828.2.git7d2a4e99.fc38.x86_64 46/590 Verifying : qnnpack-devel-0-20190828.2.git7d2a4e99.fc38.x86_ 47/590 Verifying : sleef-3.6-20240320.0.git60e76d2b.fc38.x86_64 48/590 Verifying : 
sleef-devel-3.6-20240320.0.git60e76d2b.fc38.x86_ 49/590 Verifying : tensorpipe-0-20220513.1.gitbb1473a4.fc37.x86_64 50/590 Verifying : tensorpipe-devel-0-20220513.1.gitbb1473a4.fc37.x 51/590 Verifying : libcublas-12-3-12.3.4.1-2.x86_64 52/590 Verifying : libcublas-devel-12-3-12.3.4.1-2.x86_64 53/590 Verifying : libcudnn8-8.9.7.29-2.cuda12.3.x86_64 54/590 Verifying : libcudnn8-devel-8.9.7.29-2.cuda12.3.x86_64 55/590 Verifying : libcufft-12-3-11.0.12.1-2.x86_64 56/590 Verifying : libcufft-devel-12-3-11.0.12.1-2.x86_64 57/590 Verifying : libcusolver-12-3-11.5.4.101-2.x86_64 58/590 Verifying : libcusolver-devel-12-3-11.5.4.101-2.x86_64 59/590 Verifying : libcusparse-12-3-12.2.0.103-2.x86_64 60/590 Verifying : libcusparse-devel-12-3-12.2.0.103-2.x86_64 61/590 Verifying : libnpp-12-3-12.2.3.2-2.x86_64 62/590 Verifying : cuda-cccl-12-3-12.3.101-1.x86_64 63/590 Verifying : cuda-crt-12-3-12.3.107-1.x86_64 64/590 Verifying : cuda-cudart-12-3-12.3.101-1.x86_64 65/590 Verifying : cuda-cudart-devel-12-3-12.3.101-1.x86_64 66/590 Verifying : cuda-cupti-12-3-12.3.101-1.x86_64 67/590 Verifying : cuda-driver-devel-12-3-12.3.101-1.x86_64 68/590 Verifying : cuda-nvcc-12-3-12.3.107-1.x86_64 69/590 Verifying : cuda-nvml-devel-12-3-12.3.101-1.x86_64 70/590 Verifying : cuda-nvrtc-12-3-12.3.107-1.x86_64 71/590 Verifying : cuda-nvrtc-devel-12-3-12.3.107-1.x86_64 72/590 Verifying : cuda-nvtx-12-3-12.3.101-1.x86_64 73/590 Verifying : cuda-nvvm-12-3-12.3.107-1.x86_64 74/590 Verifying : cuda-profiler-api-12-3-12.3.101-1.x86_64 75/590 Verifying : cuda-toolkit-12-3-config-common-12.3.101-1.noarc 76/590 Verifying : cuda-toolkit-12-config-common-12.4.127-1.noarch 77/590 Verifying : cuda-toolkit-config-common-12.4.127-1.noarch 78/590 Verifying : libcurand-12-3-10.3.4.107-1.x86_64 79/590 Verifying : libcurand-devel-12-3-10.3.4.107-1.x86_64 80/590 Verifying : libnccl-2.21.5-1+cuda12.4.x86_64 81/590 Verifying : libnccl-devel-2.21.5-1+cuda12.4.x86_64 82/590 Verifying : 
libnvjitlink-12-3-12.3.101-1.x86_64 83/590 Verifying : libnvjitlink-devel-12-3-12.3.101-1.x86_64 84/590 Verifying : MUMPS-5.5.1-1.fc38.x86_64 85/590 Verifying : MUMPS-common-5.5.1-1.fc38.noarch 86/590 Verifying : SuperLU-5.3.0-4.fc38.x86_64 87/590 Verifying : adobe-mappings-pdf-20190401-3.fc38.noarch 88/590 Verifying : arpack-3.8.0-6.fc38.x86_64 89/590 Verifying : byte-buddy-1.12.10-3.fc38.noarch 90/590 Verifying : byte-buddy-agent-1.12.10-3.fc38.noarch 91/590 Verifying : cdparanoia-libs-10.2-41.fc38.x86_64 92/590 Verifying : ceres-solver-2.1.0-5.fc38.x86_64 93/590 Verifying : cfitsio-4.2.0-3.fc38.x86_64 94/590 Verifying : cgnslib-libs-4.3.0-7.fc38.x86_64 95/590 Verifying : cjson-1.7.14-7.fc38.x86_64 96/590 Verifying : cliquer-libs-1.22-5.fc38.x86_64 97/590 Verifying : codec2-1.0.5-2.fc38.x86_64 98/590 Verifying : coin-or-Cbc-2.10.5-12.fc38.x86_64 99/590 Verifying : coin-or-Cgl-0.60.3-9.fc38.x86_64 100/590 Verifying : coin-or-Clp-1.17.6-12.fc38.x86_64 101/590 Verifying : coin-or-CoinUtils-2.11.4-9.fc38.x86_64 102/590 Verifying : coin-or-Osi-0.108.6-8.fc38.x86_64 103/590 Verifying : copy-jdk-configs-4.1-2.fc38.noarch 104/590 Verifying : crypto-policies-scripts-20230301-1.gita12f7b2.fc 105/590 Verifying : dbus-broker-33-1.fc38.x86_64 106/590 Verifying : double-conversion-3.1.5-8.fc38.x86_64 107/590 Verifying : doxygen-2:1.9.6-7.fc38.x86_64 108/590 Verifying : eigen3-devel-3.4.0-9.fc38.noarch 109/590 Verifying : fdk-aac-free-2.0.0-10.fc38.x86_64 110/590 Verifying : flatbuffers-23.3.3-1.fc38.x86_64 111/590 Verifying : flatbuffers-compiler-23.3.3-1.fc38.x86_64 112/590 Verifying : flatbuffers-devel-23.3.3-1.fc38.x86_64 113/590 Verifying : fonts-filesystem-1:2.0.5-11.fc38.noarch 114/590 Verifying : freetype-2.13.0-2.fc38.x86_64 115/590 Verifying : freexl-1.0.6-21.fc38.x86_64 116/590 Verifying : fribidi-1.0.12-3.fc38.x86_64 117/590 Verifying : game-music-emu-0.6.3-11.fc38.x86_64 118/590 Verifying : gc-8.2.2-3.fc38.x86_64 119/590 Verifying : gd-2.3.3-10.fc38.x86_64 120/590 
Verifying : gdk-pixbuf2-2.42.10-2.fc38.x86_64 121/590 Verifying : gdk-pixbuf2-modules-2.42.10-2.fc38.x86_64 122/590 Verifying : gecode-6.2.0-11.fc38.x86_64 123/590 Verifying : geos-3.11.1-3.fc38.x86_64 124/590 Verifying : gflags-2.2.2-11.fc38.x86_64 125/590 Verifying : gflags-devel-2.2.2-11.fc38.x86_64 126/590 Verifying : gl-manpages-1.1-26.20190306.fc38.noarch 127/590 Verifying : glog-0.3.5-17.fc38.x86_64 128/590 Verifying : glog-devel-0.3.5-17.fc38.x86_64 129/590 Verifying : glpk-5.0-6.fc38.x86_64 130/590 Verifying : glx-utils-8.5.0-1.fc38.x86_64 131/590 Verifying : gmp-c++-1:6.2.1-4.fc38.x86_64 132/590 Verifying : gmp-devel-1:6.2.1-4.fc38.x86_64 133/590 Verifying : graphene-1.10.6-5.fc38.x86_64 134/590 Verifying : graphite2-1.3.14-11.fc38.x86_64 135/590 Verifying : groff-base-1.22.4-11.fc38.x86_64 136/590 Verifying : gsl-2.7.1-4.fc38.x86_64 137/590 Verifying : gsm-1.0.22-2.fc38.x86_64 138/590 Verifying : gts-0.7.6-44.20121130.fc38.x86_64 139/590 Verifying : guile22-2.2.7-7.fc38.x86_64 140/590 Verifying : harfbuzz-7.1.0-1.fc38.x86_64 141/590 Verifying : hdf-libs-4.2.15-12.fc38.x86_64 142/590 Verifying : hdf5-1.12.1-11.fc38.x86_64 143/590 Verifying : hiredis-1.0.2-4.fc38.x86_64 144/590 Verifying : hiredis-devel-1.0.2-4.fc38.x86_64 145/590 Verifying : ilbc-3.0.4-4.fc38.x86_64 146/590 Verifying : infiniband-diags-44.0-3.fc38.x86_64 147/590 Verifying : isl-0.16.1-17.fc38.x86_64 148/590 Verifying : iso-codes-4.13.0-1.fc38.noarch 149/590 Verifying : jacop-4.9.0-1.fc38.noarch 150/590 Verifying : javapackages-filesystem-6.1.0-7.fc38.noarch 151/590 Verifying : javapackages-tools-6.1.0-7.fc38.noarch 152/590 Verifying : jbig2dec-libs-0.19-8.fc38.x86_64 153/590 Verifying : jbigkit-libs-2.1-25.fc38.x86_64 154/590 Verifying : jsoncpp-1.9.5-4.fc38.x86_64 155/590 Verifying : kmod-libs-30-4.fc38.x86_64 156/590 Verifying : lame-libs-3.100-14.fc38.x86_64 157/590 Verifying : lasi-1.1.3-10.fc38.x86_64 158/590 Verifying : lcms2-2.15-1.fc38.x86_64 159/590 Verifying : 
leveldb-1.23-6.fc38.x86_64 160/590 Verifying : leveldb-devel-1.23-6.fc38.x86_64 161/590 Verifying : libGLEW-2.2.0-4.fc38.x86_64 162/590 Verifying : libICE-1.0.10-10.fc38.x86_64 163/590 Verifying : libSM-1.2.3-12.fc38.x86_64 164/590 Verifying : libXau-1.0.11-2.fc38.x86_64 165/590 Verifying : libXau-devel-1.0.11-2.fc38.x86_64 166/590 Verifying : libXcursor-1.2.1-3.fc38.x86_64 167/590 Verifying : libXext-1.3.5-2.fc38.x86_64 168/590 Verifying : libXfixes-6.0.0-5.fc38.x86_64 169/590 Verifying : libXrender-0.9.11-2.fc38.x86_64 170/590 Verifying : libXt-1.2.1-4.fc38.x86_64 171/590 Verifying : libXv-1.0.11-18.fc38.x86_64 172/590 Verifying : libXxf86vm-1.1.5-2.fc38.x86_64 173/590 Verifying : libaec-1.0.6-4.fc38.x86_64 174/590 Verifying : libavif-0.11.1-7.fc38.x86_64 175/590 Verifying : libb2-0.98.1-8.fc38.x86_64 176/590 Verifying : libbluray-1.3.4-2.fc38.x86_64 177/590 Verifying : libcbor-0.7.0-9.fc38.x86_64 178/590 Verifying : libchromaprint-1.5.1-8.fc38.x86_64 179/590 Verifying : libcom_err-devel-1.46.5-4.fc38.x86_64 180/590 Verifying : libdatrie-0.2.13-5.fc38.x86_64 181/590 Verifying : libdc1394-2.2.6-9.fc38.x86_64 182/590 Verifying : libedit-3.1-45.20221030cvs.fc38.x86_64 183/590 Verifying : libfido2-1.12.0-3.fc38.x86_64 184/590 Verifying : libfontenc-1.1.6-2.fc38.x86_64 185/590 Verifying : libgeotiff-1.7.1-6.fc38.x86_64 186/590 Verifying : libglvnd-1:1.6.0-2.fc38.x86_64 187/590 Verifying : libglvnd-core-devel-1:1.6.0-2.fc38.x86_64 188/590 Verifying : libglvnd-devel-1:1.6.0-2.fc38.x86_64 189/590 Verifying : libglvnd-egl-1:1.6.0-2.fc38.x86_64 190/590 Verifying : libglvnd-gles-1:1.6.0-2.fc38.x86_64 191/590 Verifying : libglvnd-glx-1:1.6.0-2.fc38.x86_64 192/590 Verifying : libglvnd-opengl-1:1.6.0-2.fc38.x86_64 193/590 Verifying : libgta-1.2.1-9.fc38.x86_64 194/590 Verifying : libgudev-237-4.fc38.x86_64 195/590 Verifying : libharu-2.4.3-2.fc38.x86_64 196/590 Verifying : libibumad-44.0-3.fc38.x86_64 197/590 Verifying : libibverbs-44.0-3.fc38.x86_64 198/590 Verifying : 
libicu-72.1-2.fc38.x86_64 199/590 Verifying : libijs-0.35-17.fc38.x86_64 200/590 Verifying : libimagequant-2.17.0-4.fc38.x86_64 201/590 Verifying : libjpeg-turbo-2.1.4-2.fc38.x86_64 202/590 Verifying : libjxl-1:0.7.0-6.fc38.x86_64 203/590 Verifying : libkml-1.3.0-43.fc38.x86_64 204/590 Verifying : libldb-2.7.2-1.fc38.x86_64 205/590 Verifying : liblerc-4.0.0-3.fc38.x86_64 206/590 Verifying : libmodplug-1:0.8.9.0-16.fc38.x86_64 207/590 Verifying : libmpc-1.3.1-2.fc38.x86_64 208/590 Verifying : libnl3-3.7.0-3.fc38.x86_64 209/590 Verifying : libogg-2:1.3.5-5.fc38.x86_64 210/590 Verifying : libpaper-1:2.0.8-1.fc38.x86_64 211/590 Verifying : libpciaccess-0.16-8.fc38.x86_64 212/590 Verifying : libpng-2:1.6.37-14.fc38.x86_64 213/590 Verifying : libproxy-0.4.18-6.fc38.x86_64 214/590 Verifying : libqhull_r-1:7.2.1-12.fc38.x86_64 215/590 Verifying : librabbitmq-0.13.0-1.fc38.x86_64 216/590 Verifying : libraw1394-2.1.2-17.fc38.x86_64 217/590 Verifying : librdmacm-44.0-3.fc38.x86_64 218/590 Verifying : librist-0.2.7-1.fc38.x86_64 219/590 Verifying : librttopo-1.1.0-11.fc38.x86_64 220/590 Verifying : libseccomp-2.5.3-4.fc38.x86_64 221/590 Verifying : libselinux-devel-3.5-1.fc38.x86_64 222/590 Verifying : libsepol-devel-3.5-1.fc38.x86_64 223/590 Verifying : libsodium-1.0.18-11.fc38.x86_64 224/590 Verifying : libsodium-devel-1.0.18-11.fc38.x86_64 225/590 Verifying : libspatialite-5.0.1-20.fc38.x86_64 226/590 Verifying : libtalloc-2.4.0-2.fc38.x86_64 227/590 Verifying : libtdb-1.4.8-1.fc38.x86_64 228/590 Verifying : libtevent-0.14.1-1.fc38.x86_64 229/590 Verifying : libthai-0.1.29-4.fc38.x86_64 230/590 Verifying : libtheora-1:1.1.1-33.fc38.x86_64 231/590 Verifying : libtool-ltdl-2.4.7-6.fc38.x86_64 232/590 Verifying : libudfread-1.1.2-5.fc38.x86_64 233/590 Verifying : libunwind-1.6.2-7.fc38.x86_64 234/590 Verifying : libunwind-devel-1.6.2-7.fc38.x86_64 235/590 Verifying : libva-2.18.0-1.fc38.x86_64 236/590 Verifying : libvdpau-1.5-3.fc38.x86_64 237/590 Verifying : 
libverto-devel-0.3.2-5.fc38.x86_64 238/590 Verifying : libvisual-1:0.4.1-1.fc38.x86_64 239/590 Verifying : libvmaf-2.3.0-5.fc38.x86_64 240/590 Verifying : libvorbis-1:1.3.7-7.fc38.x86_64 241/590 Verifying : libxcb-1.13.1-11.fc38.x86_64 242/590 Verifying : libxcb-devel-1.13.1-11.fc38.x86_64 243/590 Verifying : libxkbcommon-1.5.0-2.fc38.x86_64 244/590 Verifying : libxkbcommon-x11-1.5.0-2.fc38.x86_64 245/590 Verifying : libxshmfence-1.3-12.fc38.x86_64 246/590 Verifying : libyaml-0.2.5-9.fc38.x86_64 247/590 Verifying : lksctp-tools-1.0.19-3.fc38.x86_64 248/590 Verifying : llvm15-libs-15.0.7-4.fc38.x86_64 249/590 Verifying : lpcnetfreedv-0.2-13.fc38.x86_64 250/590 Verifying : lua-5.4.4-9.fc38.x86_64 251/590 Verifying : lua-filesystem-1.8.0-8.fc38.x86_64 252/590 Verifying : lua-json-1.3.4-3.fc38.noarch 253/590 Verifying : lua-lpeg-1.0.2-10.fc38.x86_64 254/590 Verifying : lua-posix-35.1-5.fc38.x86_64 255/590 Verifying : lua-term-0.07-17.fc38.x86_64 256/590 Verifying : miniz-3.0.2-2.fc38.x86_64 257/590 Verifying : miniz-devel-3.0.2-2.fc38.x86_64 258/590 Verifying : mkfontscale-1.2.2-3.fc38.x86_64 259/590 Verifying : mockito-3.12.4-6.fc38.noarch 260/590 Verifying : mp-3.1.0-41.20200303git7fd4828.fc38.x86_64 261/590 Verifying : mpdecimal-2.5.1-6.fc38.x86_64 262/590 Verifying : mpfr-devel-4.1.1-3.fc38.x86_64 263/590 Verifying : mpg123-libs-1.31.3-1.fc38.x86_64 264/590 Verifying : mtdev-1.1.6-5.fc38.x86_64 265/590 Verifying : netcdf-4.9.0-5.fc38.x86_64 266/590 Verifying : netpbm-11.02.00-1.fc38.x86_64 267/590 Verifying : nettle-3.8-3.fc38.x86_64 268/590 Verifying : numactl-devel-2.0.16-2.fc38.x86_64 269/590 Verifying : numactl-libs-2.0.16-2.fc38.x86_64 270/590 Verifying : objectweb-asm-9.3-5.fc38.noarch 271/590 Verifying : objenesis-3.3-2.fc38.noarch 272/590 Verifying : ogdi-4.1.0-10.fc38.x86_64 273/590 Verifying : openblas-0.3.21-4.fc38.x86_64 274/590 Verifying : openblas-devel-0.3.21-4.fc38.x86_64 275/590 Verifying : openblas-openmp-0.3.21-4.fc38.x86_64 276/590 Verifying : 
openblas-openmp64-0.3.21-4.fc38.x86_64 277/590 Verifying : openblas-openmp64_-0.3.21-4.fc38.x86_64 278/590 Verifying : openblas-serial-0.3.21-4.fc38.x86_64 279/590 Verifying : openblas-serial64-0.3.21-4.fc38.x86_64 280/590 Verifying : openblas-serial64_-0.3.21-4.fc38.x86_64 281/590 Verifying : openblas-threads-0.3.21-4.fc38.x86_64 282/590 Verifying : openblas-threads64-0.3.21-4.fc38.x86_64 283/590 Verifying : openblas-threads64_-0.3.21-4.fc38.x86_64 284/590 Verifying : opencore-amr-0.1.6-3.fc38.x86_64 285/590 Verifying : openpgm-5.2.122-31.fc38.x86_64 286/590 Verifying : openpgm-devel-5.2.122-31.fc38.x86_64 287/590 Verifying : openslide-3.4.1-23.fc38.x86_64 288/590 Verifying : opentest4j-1.2.0-12.fc38.noarch 289/590 Verifying : opus-1.3.1-12.fc38.x86_64 290/590 Verifying : orc-0.4.33-2.fc38.x86_64 291/590 Verifying : pango-1.50.14-1.fc38.x86_64 292/590 Verifying : pcre-8.45-1.fc38.3.x86_64 293/590 Verifying : pcre2-devel-10.42-1.fc38.1.x86_64 294/590 Verifying : pcre2-utf16-10.42-1.fc38.1.x86_64 295/590 Verifying : pcre2-utf32-10.42-1.fc38.1.x86_64 296/590 Verifying : perl-Carp-1.52-490.fc38.noarch 297/590 Verifying : perl-Data-Dumper-2.184-491.fc38.x86_64 298/590 Verifying : perl-Digest-1.20-490.fc38.noarch 299/590 Verifying : perl-Digest-MD5-2.58-490.fc38.x86_64 300/590 Verifying : perl-Encode-4:3.19-493.fc38.x86_64 301/590 Verifying : perl-Error-1:0.17029-11.fc38.noarch 302/590 Verifying : perl-Exporter-5.77-490.fc38.noarch 303/590 Verifying : perl-File-Path-2.18-490.fc38.noarch 304/590 Verifying : perl-File-Temp-1:0.231.100-490.fc38.noarch 305/590 Verifying : perl-Getopt-Long-1:2.54-2.fc38.noarch 306/590 Verifying : perl-IO-Socket-IP-0.41-492.fc38.noarch 307/590 Verifying : perl-IO-Socket-SSL-2.081-1.fc38.noarch 308/590 Verifying : perl-MIME-Base64-3.16-490.fc38.x86_64 309/590 Verifying : perl-Mozilla-CA-20221114-2.fc38.noarch 310/590 Verifying : perl-Net-SSLeay-1.92-5.fc38.x86_64 311/590 Verifying : perl-PathTools-3.84-490.fc38.x86_64 312/590 Verifying : 
perl-Pod-Escapes-1:1.07-490.fc38.noarch 313/590 Verifying : perl-Pod-Perldoc-3.28.01-491.fc38.noarch 314/590 Verifying : perl-Pod-Simple-1:3.43-491.fc38.noarch 315/590 Verifying : perl-Pod-Usage-4:2.03-4.fc38.noarch 316/590 Verifying : perl-Scalar-List-Utils-5:1.63-490.fc38.x86_64 317/590 Verifying : perl-Socket-4:2.036-2.fc38.x86_64 318/590 Verifying : perl-Storable-1:3.26-490.fc38.x86_64 319/590 Verifying : perl-Term-ANSIColor-5.01-491.fc38.noarch 320/590 Verifying : perl-Term-Cap-1.18-1.fc38.noarch 321/590 Verifying : perl-TermReadKey-2.38-16.fc38.x86_64 322/590 Verifying : perl-Text-ParseWords-3.31-490.fc38.noarch 323/590 Verifying : perl-Time-Local-2:1.300-490.fc38.noarch 324/590 Verifying : perl-URI-5.17-2.fc38.noarch 325/590 Verifying : perl-constant-1.33-491.fc38.noarch 326/590 Verifying : perl-libnet-3.15-1.fc38.noarch 327/590 Verifying : perl-parent-1:0.241-1.fc38.noarch 328/590 Verifying : perl-podlators-1:5.01-2.fc38.noarch 329/590 Verifying : pixman-0.42.2-1.fc38.x86_64 330/590 Verifying : poppler-data-0.4.11-4.fc38.noarch 331/590 Verifying : proj-9.1.1-1.fc38.x86_64 332/590 Verifying : proj-data-9.1.1-1.fc38.noarch 333/590 Verifying : protobuf-3.19.6-2.fc38.x86_64 334/590 Verifying : pugixml-1.13-2.fc38.x86_64 335/590 Verifying : pybind11-devel-2.10.3-2.fc38.x86_64 336/590 Verifying : python-rpm-macros-3.11-10.fc38.noarch 337/590 Verifying : python-setuptools-wheel-65.5.1-2.fc38.noarch 338/590 Verifying : python3-packaging-23.0-1.fc38.noarch 339/590 Verifying : python3-pybind11-2.10.3-2.fc38.x86_64 340/590 Verifying : python3-pyyaml-6.0-6.fc38.x86_64 341/590 Verifying : python3-rpm-macros-3.11-10.fc38.noarch 342/590 Verifying : python3-setuptools-65.5.1-2.fc38.noarch 343/590 Verifying : python3-six-1.16.0-9.fc38.noarch 344/590 Verifying : python3-typing-extensions-4.5.0-1.fc38.noarch 345/590 Verifying : rdma-core-devel-44.0-3.fc38.x86_64 346/590 Verifying : rhash-1.4.3-2.fc38.x86_64 347/590 Verifying : rocksdb-7.8.3-1.fc38.x86_64 348/590 Verifying : 
rocksdb-devel-7.8.3-1.fc38.x86_64 349/590 Verifying : scotch-6.1.2-3.fc37.x86_64 350/590 Verifying : shared-mime-info-2.2-3.fc38.x86_64 351/590 Verifying : snappy-1.1.9-7.fc38.x86_64 352/590 Verifying : snappy-devel-1.1.9-7.fc38.x86_64 353/590 Verifying : soxr-0.1.3-13.fc38.x86_64 354/590 Verifying : speex-1.2.0-13.fc38.x86_64 355/590 Verifying : suitesparse-5.13.0-2.fc38.x86_64 356/590 Verifying : svt-av1-libs-1.4.1-2.fc38.x86_64 357/590 Verifying : tbb-2020.3-16.fc38.x86_64 358/590 Verifying : tbb-devel-2020.3-16.fc38.x86_64 359/590 Verifying : tcl-1:8.6.12-4.fc38.x86_64 360/590 Verifying : twolame-libs-0.4.0-2.fc38.x86_64 361/590 Verifying : unixODBC-2.3.11-2.fc38.x86_64 362/590 Verifying : uriparser-0.9.7-2.fc38.x86_64 363/590 Verifying : urw-base35-bookman-fonts-20200910-16.fc38.noarch 364/590 Verifying : urw-base35-c059-fonts-20200910-16.fc38.noarch 365/590 Verifying : urw-base35-d050000l-fonts-20200910-16.fc38.noarc 366/590 Verifying : urw-base35-fonts-20200910-16.fc38.noarch 367/590 Verifying : urw-base35-fonts-common-20200910-16.fc38.noarch 368/590 Verifying : urw-base35-gothic-fonts-20200910-16.fc38.noarch 369/590 Verifying : urw-base35-nimbus-mono-ps-fonts-20200910-16.fc38 370/590 Verifying : urw-base35-nimbus-roman-fonts-20200910-16.fc38.n 371/590 Verifying : urw-base35-nimbus-sans-fonts-20200910-16.fc38.no 372/590 Verifying : urw-base35-p052-fonts-20200910-16.fc38.noarch 373/590 Verifying : urw-base35-standard-symbols-ps-fonts-20200910-16 374/590 Verifying : urw-base35-z003-fonts-20200910-16.fc38.noarch 375/590 Verifying : vapoursynth-libs-58-4.fc38.x86_64 376/590 Verifying : vo-amrwbenc-0.1.3-18.fc38.x86_64 377/590 Verifying : vtk-9.2.5-2.fc38.x86_64 378/590 Verifying : xcb-util-0.4.1-2.fc38.x86_64 379/590 Verifying : xcb-util-image-0.4.1-2.fc38.x86_64 380/590 Verifying : xcb-util-keysyms-0.4.1-2.fc38.x86_64 381/590 Verifying : xcb-util-renderutil-0.3.10-2.fc38.x86_64 382/590 Verifying : xcb-util-wm-0.4.2-2.fc38.x86_64 383/590 Verifying : 
xkeyboard-config-2.38-1.fc38.noarch 384/590 Verifying : xml-common-0.6.3-60.fc38.noarch 385/590 Verifying : xorg-x11-fonts-ISO8859-1-100dpi-7.5-35.fc38.noar 386/590 Verifying : xorg-x11-proto-devel-2022.2-3.fc38.noarch 387/590 Verifying : xvidcore-1.3.7-9.fc38.x86_64 388/590 Verifying : zeromq-4.3.4-5.fc38.x86_64 389/590 Verifying : zeromq-devel-4.3.4-5.fc38.x86_64 390/590 Verifying : zlib-devel-1.2.13-3.fc38.x86_64 391/590 Verifying : zvbi-0.2.35-19.fc38.x86_64 392/590 Verifying : Lmod-8.7.32-1.fc38.x86_64 393/590 Verifying : adobe-mappings-cmap-20230622-1.fc38.noarch 394/590 Verifying : adobe-mappings-cmap-deprecated-20230622-1.fc38.n 395/590 Verifying : alsa-lib-1.2.11-2.fc38.x86_64 396/590 Verifying : annobin-docs-12.40-1.fc38.noarch 397/590 Verifying : annobin-plugin-gcc-12.40-1.fc38.x86_64 398/590 Verifying : armadillo-12.8.1-1.fc38.x86_64 399/590 Verifying : avahi-libs-0.8-22.fc38.x86_64 400/590 Verifying : blosc-1.21.5-2.fc38.x86_64 401/590 Verifying : cairo-1.17.8-4.fc38.x86_64 402/590 Verifying : cairo-gobject-1.17.8-4.fc38.x86_64 403/590 Verifying : clang15-libs-15.0.7-5.fc38.x86_64 404/590 Verifying : clang15-resource-filesystem-15.0.7-5.fc38.x86_64 405/590 Verifying : cmake-3.27.7-1.fc38.x86_64 406/590 Verifying : cmake-data-3.27.7-1.fc38.noarch 407/590 Verifying : cmake-filesystem-3.27.7-1.fc38.x86_64 408/590 Verifying : cmake-rpm-macros-3.27.7-1.fc38.noarch 409/590 Verifying : cpp-13.2.1-7.fc38.x86_64 410/590 Verifying : cups-libs-1:2.4.7-11.fc38.x86_64 411/590 Verifying : dbus-1:1.14.10-1.fc38.x86_64 412/590 Verifying : dbus-common-1:1.14.10-1.fc38.noarch 413/590 Verifying : dbus-libs-1:1.14.10-1.fc38.x86_64 414/590 Verifying : emacs-filesystem-1:29.3-1.fc38.noarch 415/590 Verifying : expat-2.6.0-1.fc38.x86_64 416/590 Verifying : fftw-3.3.10-10.fc38.x86_64 417/590 Verifying : fftw-devel-3.3.10-10.fc38.x86_64 418/590 Verifying : fftw-libs-3.3.10-10.fc38.x86_64 419/590 Verifying : fftw-libs-double-3.3.10-10.fc38.x86_64 420/590 Verifying : 
fftw-libs-long-3.3.10-10.fc38.x86_64 421/590 Verifying : fftw-libs-quad-3.3.10-10.fc38.x86_64 422/590 Verifying : fftw-libs-single-3.3.10-10.fc38.x86_64 423/590 Verifying : flexiblas-3.4.2-1.fc38.x86_64 424/590 Verifying : flexiblas-netlib-3.4.2-1.fc38.x86_64 425/590 Verifying : flexiblas-netlib64-3.4.2-1.fc38.x86_64 426/590 Verifying : flexiblas-openblas-openmp-3.4.2-1.fc38.x86_64 427/590 Verifying : flexiblas-openblas-openmp64-3.4.2-1.fc38.x86_64 428/590 Verifying : fontconfig-2.14.2-2.fc38.x86_64 429/590 Verifying : gcc-13.2.1-7.fc38.x86_64 430/590 Verifying : gcc-c++-13.2.1-7.fc38.x86_64 431/590 Verifying : gcc-plugin-annobin-13.2.1-7.fc38.x86_64 432/590 Verifying : gdal-libs-3.6.4-2.fc38.x86_64 433/590 Verifying : giflib-5.2.2-1.fc38.x86_64 434/590 Verifying : git-2.44.0-1.fc38.x86_64 435/590 Verifying : git-core-2.44.0-1.fc38.x86_64 436/590 Verifying : git-core-doc-2.44.0-1.fc38.noarch 437/590 Verifying : glib2-2.76.6-1.fc38.x86_64 438/590 Verifying : glibc-devel-2.37-18.fc38.x86_64 439/590 Verifying : glibc-headers-x86-2.37-18.fc38.noarch 440/590 Verifying : gnutls-3.8.4-1.fc38.x86_64 441/590 Verifying : google-droid-sans-fonts-20200215-15.fc38.noarch 442/590 Verifying : google-noto-fonts-common-20230201-2.fc38.noarch 443/590 Verifying : google-noto-sans-vf-fonts-20230201-2.fc38.noarch 444/590 Verifying : graphviz-7.1.0-3.fc38.x86_64 445/590 Verifying : gstreamer1-1.22.9-1.fc38.x86_64 446/590 Verifying : gstreamer1-plugins-base-1.22.9-1.fc38.x86_64 447/590 Verifying : highway-1.1.0-1.fc38.x86_64 448/590 Verifying : hwdata-0.380-1.fc38.noarch 449/590 Verifying : imath-3.1.10-1.fc38.x86_64 450/590 Verifying : java-17-openjdk-headless-1:17.0.9.0.9-3.fc38.x86 451/590 Verifying : json-c-0.17-1.fc38.x86_64 452/590 Verifying : kernel-headers-6.8.3-100.fc38.x86_64 453/590 Verifying : keyutils-libs-devel-1.6.3-1.fc38.x86_64 454/590 Verifying : krb5-devel-1.21-3.fc38.x86_64 455/590 Verifying : langpacks-core-font-en-3.0-32.fc38.noarch 456/590 Verifying : 
less-633-1.fc38.x86_64 457/590 Verifying : libX11-1.8.7-1.fc38.x86_64 458/590 Verifying : libX11-common-1.8.7-1.fc38.noarch 459/590 Verifying : libX11-devel-1.8.7-1.fc38.x86_64 460/590 Verifying : libX11-xcb-1.8.7-1.fc38.x86_64 461/590 Verifying : libXft-2.3.8-2.fc38.x86_64 462/590 Verifying : libXi-1.8.1-1.fc38.x86_64 463/590 Verifying : libXpm-3.5.17-1.fc38.x86_64 464/590 Verifying : libaom-3.8.2-1.fc38.x86_64 465/590 Verifying : libavcodec-free-6.0.1-2.fc38.x86_64 466/590 Verifying : libavformat-free-6.0.1-2.fc38.x86_64 467/590 Verifying : libavutil-free-6.0.1-2.fc38.x86_64 468/590 Verifying : libdav1d-1.2.1-1.fc38.x86_64 469/590 Verifying : libdrm-2.4.120-1.fc38.x86_64 470/590 Verifying : libevdev-1.13.1-1.fc38.x86_64 471/590 Verifying : libgcrypt-1.10.2-1.fc38.x86_64 472/590 Verifying : libgfortran-13.2.1-7.fc38.x86_64 473/590 Verifying : libgpg-error-1.47-1.fc38.x86_64 474/590 Verifying : libgs-10.02.1-2.fc38.x86_64 475/590 Verifying : libinput-1.23.0-2.fc38.x86_64 476/590 Verifying : libkadm5-1.21-3.fc38.x86_64 477/590 Verifying : libnauty-2.8.6-5.fc38.x86_64 478/590 Verifying : libopenmpt-0.6.12-1.fc38.x86_64 479/590 Verifying : libpq-15.3-1.fc38.x86_64 480/590 Verifying : libquadmath-13.2.1-7.fc38.x86_64 481/590 Verifying : librsvg2-2.56.4-1.fc38.x86_64 482/590 Verifying : libsmbclient-2:4.18.11-1.fc38.x86_64 483/590 Verifying : libstdc++-devel-13.2.1-7.fc38.x86_64 484/590 Verifying : libswresample-free-6.0.1-2.fc38.x86_64 485/590 Verifying : libswscale-free-6.0.1-2.fc38.x86_64 486/590 Verifying : libtiff-4.4.0-8.fc38.x86_64 487/590 Verifying : liburing-2.4-2.fc38.x86_64 488/590 Verifying : libusb1-1.0.27-1.fc38.x86_64 489/590 Verifying : libuv-1:1.48.0-1.fc38.x86_64 490/590 Verifying : libuv-devel-1:1.48.0-1.fc38.x86_64 491/590 Verifying : libuv-static-1:1.48.0-1.fc38.x86_64 492/590 Verifying : libvpl-1:2.10.2-1.fc38.x86_64 493/590 Verifying : libvpx-1.13.1-1.fc38.x86_64 494/590 Verifying : libwacom-2.8.0-1.fc38.x86_64 495/590 Verifying : 
libwacom-data-2.8.0-1.fc38.noarch 496/590 Verifying : libwayland-client-1.22.0-1.fc38.x86_64 497/590 Verifying : libwayland-cursor-1.22.0-1.fc38.x86_64 498/590 Verifying : libwayland-egl-1.22.0-1.fc38.x86_64 499/590 Verifying : libwayland-server-1.22.0-1.fc38.x86_64 500/590 Verifying : libwbclient-2:4.18.11-1.fc38.x86_64 501/590 Verifying : libwebp-1.3.2-2.fc38.x86_64 502/590 Verifying : libxcrypt-devel-4.4.36-1.fc38.x86_64 503/590 Verifying : libzstd-devel-1.5.5-1.fc38.x86_64 504/590 Verifying : llvm-libs-16.0.6-3.fc38.x86_64 505/590 Verifying : lmdb-0.9.32-1.fc38.x86_64 506/590 Verifying : lmdb-devel-0.9.32-1.fc38.x86_64 507/590 Verifying : lmdb-libs-0.9.32-1.fc38.x86_64 508/590 Verifying : make-1:4.4.1-1.fc38.x86_64 509/590 Verifying : mariadb-connector-c-3.3.8-1.fc38.x86_64 510/590 Verifying : mariadb-connector-c-config-3.3.8-1.fc38.noarch 511/590 Verifying : mbedtls-2.28.7-1.fc38.x86_64 512/590 Verifying : mesa-filesystem-23.1.9-1.fc38.x86_64 513/590 Verifying : mesa-libEGL-23.1.9-1.fc38.x86_64 514/590 Verifying : mesa-libGL-23.1.9-1.fc38.x86_64 515/590 Verifying : mesa-libGLU-9.0.3-1.fc38.x86_64 516/590 Verifying : mesa-libGLU-devel-9.0.3-1.fc38.x86_64 517/590 Verifying : mesa-libgbm-23.1.9-1.fc38.x86_64 518/590 Verifying : mesa-libglapi-23.1.9-1.fc38.x86_64 519/590 Verifying : minizip-ng-3.0.7-4.fc38.x86_64 520/590 Verifying : ncurses-6.4-7.20230520.fc38.1.x86_64 521/590 Verifying : nspr-4.35.0-17.fc38.x86_64 522/590 Verifying : nss-3.99.0-1.fc38.x86_64 523/590 Verifying : nss-softokn-3.99.0-1.fc38.x86_64 524/590 Verifying : nss-softokn-freebl-3.99.0-1.fc38.x86_64 525/590 Verifying : nss-sysinit-3.99.0-1.fc38.x86_64 526/590 Verifying : nss-util-3.99.0-1.fc38.x86_64 527/590 Verifying : ocl-icd-2.3.2-1.fc38.x86_64 528/590 Verifying : ocl-icd-devel-2.3.2-1.fc38.x86_64 529/590 Verifying : opencl-headers-3.0-18.20231003git9ce9a72.fc38.no 530/590 Verifying : openexr-libs-3.1.10-1.fc38.x86_64 531/590 Verifying : openjpeg2-2.5.2-1.fc38.x86_64 532/590 Verifying : 
openssh-9.0p1-19.fc38.x86_64 533/590 Verifying : openssh-clients-9.0p1-19.fc38.x86_64 534/590 Verifying : perl-AutoLoader-5.74-498.fc38.noarch 535/590 Verifying : perl-B-1.83-498.fc38.x86_64 536/590 Verifying : perl-Class-Struct-0.66-498.fc38.noarch 537/590 Verifying : perl-DynaLoader-1.52-498.fc38.x86_64 538/590 Verifying : perl-Errno-1.36-498.fc38.x86_64 539/590 Verifying : perl-Fcntl-1.15-498.fc38.x86_64 540/590 Verifying : perl-File-Basename-2.85-498.fc38.noarch 541/590 Verifying : perl-File-Find-1.40-498.fc38.noarch 542/590 Verifying : perl-File-stat-1.12-498.fc38.noarch 543/590 Verifying : perl-FileHandle-2.03-498.fc38.noarch 544/590 Verifying : perl-Getopt-Std-1.13-498.fc38.noarch 545/590 Verifying : perl-Git-2.44.0-1.fc38.noarch 546/590 Verifying : perl-HTTP-Tiny-0.086-2.fc38.noarch 547/590 Verifying : perl-IO-1.50-498.fc38.x86_64 548/590 Verifying : perl-IPC-Open3-1.22-498.fc38.noarch 549/590 Verifying : perl-POSIX-2.03-498.fc38.x86_64 550/590 Verifying : perl-SelectSaver-1.02-498.fc38.noarch 551/590 Verifying : perl-Symbol-1.09-498.fc38.noarch 552/590 Verifying : perl-Text-Tabs+Wrap-2023.0511-1.fc38.noarch 553/590 Verifying : perl-base-2.27-498.fc38.noarch 554/590 Verifying : perl-if-0.61.000-498.fc38.noarch 555/590 Verifying : perl-interpreter-4:5.36.3-498.fc38.x86_64 556/590 Verifying : perl-lib-0.65-498.fc38.x86_64 557/590 Verifying : perl-libs-4:5.36.3-498.fc38.x86_64 558/590 Verifying : perl-locale-1.10-498.fc38.noarch 559/590 Verifying : perl-mro-1.26-498.fc38.x86_64 560/590 Verifying : perl-overload-1.35-498.fc38.noarch 561/590 Verifying : perl-overloading-0.02-498.fc38.noarch 562/590 Verifying : perl-vars-1.05-498.fc38.noarch 563/590 Verifying : poppler-23.02.0-3.fc38.x86_64 564/590 Verifying : poppler-glib-23.02.0-3.fc38.x86_64 565/590 Verifying : procps-ng-3.3.17-11.fc38.x86_64 566/590 Verifying : pyproject-rpm-macros-1.12.0-1.fc38.noarch 567/590 Verifying : python-pip-wheel-22.3.1-3.fc38.noarch 568/590 Verifying : python3-3.11.8-2.fc38.x86_64 
569/590 Verifying : python3-devel-3.11.8-2.fc38.x86_64 570/590 Verifying : python3-libs-3.11.8-2.fc38.x86_64 571/590 Verifying : python3-numpy-1:1.24.4-1.fc38.x86_64 572/590 Verifying : python3-rpm-generators-14-4.fc38.noarch 573/590 Verifying : qt-settings-38.3-1.fc38.noarch 574/590 Verifying : qt5-qtbase-5.15.12-5.fc38.x86_64 575/590 Verifying : qt5-qtbase-common-5.15.12-5.fc38.noarch 576/590 Verifying : qt5-qtbase-gui-5.15.12-5.fc38.x86_64 577/590 Verifying : rav1e-libs-0.7.1-1.fc38.x86_64 578/590 Verifying : samba-client-libs-2:4.18.11-1.fc38.x86_64 579/590 Verifying : samba-common-2:4.18.11-1.fc38.noarch 580/590 Verifying : samba-common-libs-2:4.18.11-1.fc38.x86_64 581/590 Verifying : srt-libs-1.5.2-1.fc38.x86_64 582/590 Verifying : systemd-253.17-1.fc38.x86_64 583/590 Verifying : systemd-pam-253.17-1.fc38.x86_64 584/590 Verifying : systemd-rpm-macros-253.17-1.fc38.noarch 585/590 Verifying : tzdata-java-2024a-1.fc38.noarch 586/590 Verifying : vim-filesystem-2:9.1.264-1.fc38.noarch 587/590 Verifying : xapian-core-libs-1.4.23-1.fc38.x86_64 588/590 Verifying : xerces-c-3.2.5-1.fc38.x86_64 589/590 Verifying : zimg-3.0.5-1.fc38.x86_64 590/590 Installed: Lmod-8.7.32-1.fc38.x86_64 MUMPS-5.5.1-1.fc38.x86_64 MUMPS-common-5.5.1-1.fc38.noarch SuperLU-5.3.0-4.fc38.x86_64 adobe-mappings-cmap-20230622-1.fc38.noarch adobe-mappings-cmap-deprecated-20230622-1.fc38.noarch adobe-mappings-pdf-20190401-3.fc38.noarch alsa-lib-1.2.11-2.fc38.x86_64 annobin-docs-12.40-1.fc38.noarch annobin-plugin-gcc-12.40-1.fc38.x86_64 armadillo-12.8.1-1.fc38.x86_64 arpack-3.8.0-6.fc38.x86_64 asmjit-1:0-20220702.1.gitc5984762.fc38.x86_64 asmjit-devel-1:0-20220702.1.gitc5984762.fc38.x86_64 avahi-libs-0.8-22.fc38.x86_64 blosc-1.21.5-2.fc38.x86_64 byte-buddy-1.12.10-3.fc38.noarch byte-buddy-agent-1.12.10-3.fc38.noarch cairo-1.17.8-4.fc38.x86_64 cairo-gobject-1.17.8-4.fc38.x86_64 cdparanoia-libs-10.2-41.fc38.x86_64 ceres-solver-2.1.0-5.fc38.x86_64 cfitsio-4.2.0-3.fc38.x86_64 
cgnslib-libs-4.3.0-7.fc38.x86_64 cjson-1.7.14-7.fc38.x86_64 clang15-libs-15.0.7-5.fc38.x86_64 clang15-resource-filesystem-15.0.7-5.fc38.x86_64 cliquer-libs-1.22-5.fc38.x86_64 cmake-3.27.7-1.fc38.x86_64 cmake-data-3.27.7-1.fc38.noarch cmake-filesystem-3.27.7-1.fc38.x86_64 cmake-rpm-macros-3.27.7-1.fc38.noarch codec2-1.0.5-2.fc38.x86_64 coin-or-Cbc-2.10.5-12.fc38.x86_64 coin-or-Cgl-0.60.3-9.fc38.x86_64 coin-or-Clp-1.17.6-12.fc38.x86_64 coin-or-CoinUtils-2.11.4-9.fc38.x86_64 coin-or-Osi-0.108.6-8.fc38.x86_64 copy-jdk-configs-4.1-2.fc38.noarch cpp-13.2.1-7.fc38.x86_64 cpuinfo-1:0-20240327.0.gitf42f5eaf.fc38.x86_64 cpuinfo-devel-1:0-20240327.0.gitf42f5eaf.fc38.x86_64 crypto-policies-scripts-20230301-1.gita12f7b2.fc38.noarch cuda-cccl-12-3-12.3.101-1.x86_64 cuda-crt-12-3-12.3.107-1.x86_64 cuda-cudart-12-3-12.3.101-1.x86_64 cuda-cudart-devel-12-3-12.3.101-1.x86_64 cuda-cupti-12-3-12.3.101-1.x86_64 cuda-driver-devel-12-3-12.3.101-1.x86_64 cuda-gcc-12-12.3.1-1.fc38.x86_64 cuda-gcc-12-c++-12.3.1-1.fc38.x86_64 cuda-nvcc-12-3-12.3.107-1.x86_64 cuda-nvml-devel-12-3-12.3.101-1.x86_64 cuda-nvrtc-12-3-12.3.107-1.x86_64 cuda-nvrtc-devel-12-3-12.3.107-1.x86_64 cuda-nvtx-12-3-12.3.101-1.x86_64 cuda-nvvm-12-3-12.3.107-1.x86_64 cuda-profiler-api-12-3-12.3.101-1.x86_64 cuda-toolkit-12-3-config-common-12.3.101-1.noarch cuda-toolkit-12-config-common-12.4.127-1.noarch cuda-toolkit-config-common-12.4.127-1.noarch cups-libs-1:2.4.7-11.fc38.x86_64 cutlass-3.4.1-20240215.0.cu12_3.fc38.x86_64 cutlass-devel-3.4.1-20240215.0.cu12_3.fc38.x86_64 dbus-1:1.14.10-1.fc38.x86_64 dbus-broker-33-1.fc38.x86_64 dbus-common-1:1.14.10-1.fc38.noarch dbus-libs-1:1.14.10-1.fc38.x86_64 double-conversion-3.1.5-8.fc38.x86_64 doxygen-2:1.9.6-7.fc38.x86_64 eigen3-devel-3.4.0-9.fc38.noarch emacs-filesystem-1:29.3-1.fc38.noarch expat-2.6.0-1.fc38.x86_64 fbgemm-0.7.0-20240315.0.git0049a2ca.fc38.x86_64 fbgemm-devel-0.7.0-20240315.0.git0049a2ca.fc38.x86_64 fdk-aac-free-2.0.0-10.fc38.x86_64 fftw-3.3.10-10.fc38.x86_64 
fftw-devel-3.3.10-10.fc38.x86_64 fftw-libs-3.3.10-10.fc38.x86_64 fftw-libs-double-3.3.10-10.fc38.x86_64 fftw-libs-long-3.3.10-10.fc38.x86_64 fftw-libs-quad-3.3.10-10.fc38.x86_64 fftw-libs-single-3.3.10-10.fc38.x86_64 flatbuffers-23.3.3-1.fc38.x86_64 flatbuffers-compiler-23.3.3-1.fc38.x86_64 flatbuffers-devel-23.3.3-1.fc38.x86_64 flexiblas-3.4.2-1.fc38.x86_64 flexiblas-netlib-3.4.2-1.fc38.x86_64 flexiblas-netlib64-3.4.2-1.fc38.x86_64 flexiblas-openblas-openmp-3.4.2-1.fc38.x86_64 flexiblas-openblas-openmp64-3.4.2-1.fc38.x86_64 fontconfig-2.14.2-2.fc38.x86_64 fonts-filesystem-1:2.0.5-11.fc38.noarch foxi-0-20210526.1.gitc278588e.fc37.x86_64 foxi-devel-0-20210526.1.gitc278588e.fc37.x86_64 fp16-1:0-20240410.0.git581ac1c7.fc38.x86_64 fp16-devel-1:0-20240410.0.git581ac1c7.fc38.x86_64 freetype-2.13.0-2.fc38.x86_64 freexl-1.0.6-21.fc38.x86_64 fribidi-1.0.12-3.fc38.x86_64 fxdiv-devel-1:0-20201208.1.git63058eff.fc38.noarch game-music-emu-0.6.3-11.fc38.x86_64 gc-8.2.2-3.fc38.x86_64 gcc-13.2.1-7.fc38.x86_64 gcc-c++-13.2.1-7.fc38.x86_64 gcc-plugin-annobin-13.2.1-7.fc38.x86_64 gd-2.3.3-10.fc38.x86_64 gdal-libs-3.6.4-2.fc38.x86_64 gdk-pixbuf2-2.42.10-2.fc38.x86_64 gdk-pixbuf2-modules-2.42.10-2.fc38.x86_64 gecode-6.2.0-11.fc38.x86_64 gemmlowp-devel-0-20231104.0.git16e8662c.fc38.noarch geos-3.11.1-3.fc38.x86_64 gflags-2.2.2-11.fc38.x86_64 gflags-devel-2.2.2-11.fc38.x86_64 giflib-5.2.2-1.fc38.x86_64 git-2.44.0-1.fc38.x86_64 git-core-2.44.0-1.fc38.x86_64 git-core-doc-2.44.0-1.fc38.noarch gklib-5.1.1-20230326.0.git8bd6bad7.fc38.x86_64 gl-manpages-1.1-26.20190306.fc38.noarch glib2-2.76.6-1.fc38.x86_64 glibc-devel-2.37-18.fc38.x86_64 glibc-headers-x86-2.37-18.fc38.noarch glog-0.3.5-17.fc38.x86_64 glog-devel-0.3.5-17.fc38.x86_64 gloo-1:0.5.0-20240302.0.git2565674c.cu12_3.fc38.x86_64 gloo-devel-1:0.5.0-20240302.0.git2565674c.cu12_3.fc38.x86_64 glpk-5.0-6.fc38.x86_64 glx-utils-8.5.0-1.fc38.x86_64 gmp-c++-1:6.2.1-4.fc38.x86_64 gmp-devel-1:6.2.1-4.fc38.x86_64 gnutls-3.8.4-1.fc38.x86_64 
google-droid-sans-fonts-20200215-15.fc38.noarch google-noto-fonts-common-20230201-2.fc38.noarch google-noto-sans-vf-fonts-20230201-2.fc38.noarch graphene-1.10.6-5.fc38.x86_64 graphite2-1.3.14-11.fc38.x86_64 graphviz-7.1.0-3.fc38.x86_64 groff-base-1.22.4-11.fc38.x86_64 gsl-2.7.1-4.fc38.x86_64 gsm-1.0.22-2.fc38.x86_64 gstreamer1-1.22.9-1.fc38.x86_64 gstreamer1-plugins-base-1.22.9-1.fc38.x86_64 gts-0.7.6-44.20121130.fc38.x86_64 guile22-2.2.7-7.fc38.x86_64 halide-17.0.1-20240220.0.fc38.x86_64 harfbuzz-7.1.0-1.fc38.x86_64 hdf-libs-4.2.15-12.fc38.x86_64 hdf5-1.12.1-11.fc38.x86_64 highway-1.1.0-1.fc38.x86_64 hiredis-1.0.2-4.fc38.x86_64 hiredis-devel-1.0.2-4.fc38.x86_64 hwdata-0.380-1.fc38.noarch ilbc-3.0.4-4.fc38.x86_64 imath-3.1.10-1.fc38.x86_64 infiniband-diags-44.0-3.fc38.x86_64 isl-0.16.1-17.fc38.x86_64 iso-codes-4.13.0-1.fc38.noarch jacop-4.9.0-1.fc38.noarch java-17-openjdk-headless-1:17.0.9.0.9-3.fc38.x86_64 javapackages-filesystem-6.1.0-7.fc38.noarch javapackages-tools-6.1.0-7.fc38.noarch jbig2dec-libs-0.19-8.fc38.x86_64 jbigkit-libs-2.1-25.fc38.x86_64 json-c-0.17-1.fc38.x86_64 jsoncpp-1.9.5-4.fc38.x86_64 kernel-headers-6.8.3-100.fc38.x86_64 keyutils-libs-devel-1.6.3-1.fc38.x86_64 kineto-0.4.0-20240327.0.git445909a8.cu12_3.fc38.x86_64 kineto-devel-0.4.0-20240327.0.git445909a8.cu12_3.fc38.x86_64 kmod-libs-30-4.fc38.x86_64 krb5-devel-1.21-3.fc38.x86_64 lame-libs-3.100-14.fc38.x86_64 langpacks-core-font-en-3.0-32.fc38.noarch lasi-1.1.3-10.fc38.x86_64 lcms2-2.15-1.fc38.x86_64 less-633-1.fc38.x86_64 leveldb-1.23-6.fc38.x86_64 leveldb-devel-1.23-6.fc38.x86_64 libGLEW-2.2.0-4.fc38.x86_64 libICE-1.0.10-10.fc38.x86_64 libSM-1.2.3-12.fc38.x86_64 libX11-1.8.7-1.fc38.x86_64 libX11-common-1.8.7-1.fc38.noarch libX11-devel-1.8.7-1.fc38.x86_64 libX11-xcb-1.8.7-1.fc38.x86_64 libXau-1.0.11-2.fc38.x86_64 libXau-devel-1.0.11-2.fc38.x86_64 libXcursor-1.2.1-3.fc38.x86_64 libXext-1.3.5-2.fc38.x86_64 libXfixes-6.0.0-5.fc38.x86_64 libXft-2.3.8-2.fc38.x86_64 libXi-1.8.1-1.fc38.x86_64 
libXpm-3.5.17-1.fc38.x86_64 libXrender-0.9.11-2.fc38.x86_64 libXt-1.2.1-4.fc38.x86_64 libXv-1.0.11-18.fc38.x86_64 libXxf86vm-1.1.5-2.fc38.x86_64 libaec-1.0.6-4.fc38.x86_64 libaom-3.8.2-1.fc38.x86_64 libavcodec-free-6.0.1-2.fc38.x86_64 libavformat-free-6.0.1-2.fc38.x86_64 libavif-0.11.1-7.fc38.x86_64 libavutil-free-6.0.1-2.fc38.x86_64 libb2-0.98.1-8.fc38.x86_64 libbluray-1.3.4-2.fc38.x86_64 libcbor-0.7.0-9.fc38.x86_64 libchromaprint-1.5.1-8.fc38.x86_64 libcom_err-devel-1.46.5-4.fc38.x86_64 libcublas-12-3-12.3.4.1-2.x86_64 libcublas-devel-12-3-12.3.4.1-2.x86_64 libcudnn8-8.9.7.29-2.cuda12.3.x86_64 libcudnn8-devel-8.9.7.29-2.cuda12.3.x86_64 libcufft-12-3-11.0.12.1-2.x86_64 libcufft-devel-12-3-11.0.12.1-2.x86_64 libcurand-12-3-10.3.4.107-1.x86_64 libcurand-devel-12-3-10.3.4.107-1.x86_64 libcusolver-12-3-11.5.4.101-2.x86_64 libcusolver-devel-12-3-11.5.4.101-2.x86_64 libcusparse-12-3-12.2.0.103-2.x86_64 libcusparse-devel-12-3-12.2.0.103-2.x86_64 libdatrie-0.2.13-5.fc38.x86_64 libdav1d-1.2.1-1.fc38.x86_64 libdc1394-2.2.6-9.fc38.x86_64 libdrm-2.4.120-1.fc38.x86_64 libedit-3.1-45.20221030cvs.fc38.x86_64 libevdev-1.13.1-1.fc38.x86_64 libfido2-1.12.0-3.fc38.x86_64 libfontenc-1.1.6-2.fc38.x86_64 libgcrypt-1.10.2-1.fc38.x86_64 libgeotiff-1.7.1-6.fc38.x86_64 libgfortran-13.2.1-7.fc38.x86_64 libglvnd-1:1.6.0-2.fc38.x86_64 libglvnd-core-devel-1:1.6.0-2.fc38.x86_64 libglvnd-devel-1:1.6.0-2.fc38.x86_64 libglvnd-egl-1:1.6.0-2.fc38.x86_64 libglvnd-gles-1:1.6.0-2.fc38.x86_64 libglvnd-glx-1:1.6.0-2.fc38.x86_64 libglvnd-opengl-1:1.6.0-2.fc38.x86_64 libgpg-error-1.47-1.fc38.x86_64 libgs-10.02.1-2.fc38.x86_64 libgta-1.2.1-9.fc38.x86_64 libgudev-237-4.fc38.x86_64 libharu-2.4.3-2.fc38.x86_64 libibumad-44.0-3.fc38.x86_64 libibverbs-44.0-3.fc38.x86_64 libicu-72.1-2.fc38.x86_64 libijs-0.35-17.fc38.x86_64 libimagequant-2.17.0-4.fc38.x86_64 libinput-1.23.0-2.fc38.x86_64 libjpeg-turbo-2.1.4-2.fc38.x86_64 libjxl-1:0.7.0-6.fc38.x86_64 libkadm5-1.21-3.fc38.x86_64 libkml-1.3.0-43.fc38.x86_64 
libldb-2.7.2-1.fc38.x86_64 liblerc-4.0.0-3.fc38.x86_64 libmodplug-1:0.8.9.0-16.fc38.x86_64 libmpc-1.3.1-2.fc38.x86_64 libnauty-2.8.6-5.fc38.x86_64 libnccl-2.21.5-1+cuda12.4.x86_64 libnccl-devel-2.21.5-1+cuda12.4.x86_64 libnl3-3.7.0-3.fc38.x86_64 libnpp-12-3-12.2.3.2-2.x86_64 libnvjitlink-12-3-12.3.101-1.x86_64 libnvjitlink-devel-12-3-12.3.101-1.x86_64 libogg-2:1.3.5-5.fc38.x86_64 libopenmpt-0.6.12-1.fc38.x86_64 libpaper-1:2.0.8-1.fc38.x86_64 libpciaccess-0.16-8.fc38.x86_64 libpng-2:1.6.37-14.fc38.x86_64 libpq-15.3-1.fc38.x86_64 libproxy-0.4.18-6.fc38.x86_64 libqhull_r-1:7.2.1-12.fc38.x86_64 libquadmath-13.2.1-7.fc38.x86_64 librabbitmq-0.13.0-1.fc38.x86_64 libraw1394-2.1.2-17.fc38.x86_64 librdmacm-44.0-3.fc38.x86_64 librist-0.2.7-1.fc38.x86_64 librsvg2-2.56.4-1.fc38.x86_64 librttopo-1.1.0-11.fc38.x86_64 libseccomp-2.5.3-4.fc38.x86_64 libselinux-devel-3.5-1.fc38.x86_64 libsepol-devel-3.5-1.fc38.x86_64 libsmbclient-2:4.18.11-1.fc38.x86_64 libsodium-1.0.18-11.fc38.x86_64 libsodium-devel-1.0.18-11.fc38.x86_64 libspatialite-5.0.1-20.fc38.x86_64 libstdc++-devel-13.2.1-7.fc38.x86_64 libswresample-free-6.0.1-2.fc38.x86_64 libswscale-free-6.0.1-2.fc38.x86_64 libtalloc-2.4.0-2.fc38.x86_64 libtdb-1.4.8-1.fc38.x86_64 libtevent-0.14.1-1.fc38.x86_64 libthai-0.1.29-4.fc38.x86_64 libtheora-1:1.1.1-33.fc38.x86_64 libtiff-4.4.0-8.fc38.x86_64 libtool-ltdl-2.4.7-6.fc38.x86_64 libudfread-1.1.2-5.fc38.x86_64 libunwind-1.6.2-7.fc38.x86_64 libunwind-devel-1.6.2-7.fc38.x86_64 liburing-2.4-2.fc38.x86_64 libusb1-1.0.27-1.fc38.x86_64 libuv-1:1.48.0-1.fc38.x86_64 libuv-devel-1:1.48.0-1.fc38.x86_64 libuv-static-1:1.48.0-1.fc38.x86_64 libva-2.18.0-1.fc38.x86_64 libvdpau-1.5-3.fc38.x86_64 libverto-devel-0.3.2-5.fc38.x86_64 libvisual-1:0.4.1-1.fc38.x86_64 libvmaf-2.3.0-5.fc38.x86_64 libvorbis-1:1.3.7-7.fc38.x86_64 libvpl-1:2.10.2-1.fc38.x86_64 libvpx-1.13.1-1.fc38.x86_64 libwacom-2.8.0-1.fc38.x86_64 libwacom-data-2.8.0-1.fc38.noarch libwayland-client-1.22.0-1.fc38.x86_64 
libwayland-cursor-1.22.0-1.fc38.x86_64 libwayland-egl-1.22.0-1.fc38.x86_64 libwayland-server-1.22.0-1.fc38.x86_64 libwbclient-2:4.18.11-1.fc38.x86_64 libwebp-1.3.2-2.fc38.x86_64 libxcb-1.13.1-11.fc38.x86_64 libxcb-devel-1.13.1-11.fc38.x86_64 libxcrypt-devel-4.4.36-1.fc38.x86_64 libxkbcommon-1.5.0-2.fc38.x86_64 libxkbcommon-x11-1.5.0-2.fc38.x86_64 libxshmfence-1.3-12.fc38.x86_64 libyaml-0.2.5-9.fc38.x86_64 libzstd-devel-1.5.5-1.fc38.x86_64 lksctp-tools-1.0.19-3.fc38.x86_64 llvm-libs-16.0.6-3.fc38.x86_64 llvm15-libs-15.0.7-4.fc38.x86_64 lmdb-0.9.32-1.fc38.x86_64 lmdb-devel-0.9.32-1.fc38.x86_64 lmdb-libs-0.9.32-1.fc38.x86_64 lpcnetfreedv-0.2-13.fc38.x86_64 lua-5.4.4-9.fc38.x86_64 lua-filesystem-1.8.0-8.fc38.x86_64 lua-json-1.3.4-3.fc38.noarch lua-lpeg-1.0.2-10.fc38.x86_64 lua-posix-35.1-5.fc38.x86_64 lua-term-0.07-17.fc38.x86_64 magma-2.8.0-20240328.0.cu12_3.fc38.x86_64 magma-devel-2.8.0-20240328.0.cu12_3.fc38.x86_64 make-1:4.4.1-1.fc38.x86_64 mariadb-connector-c-3.3.8-1.fc38.x86_64 mariadb-connector-c-config-3.3.8-1.fc38.noarch mbedtls-2.28.7-1.fc38.x86_64 mesa-filesystem-23.1.9-1.fc38.x86_64 mesa-libEGL-23.1.9-1.fc38.x86_64 mesa-libGL-23.1.9-1.fc38.x86_64 mesa-libGLU-9.0.3-1.fc38.x86_64 mesa-libGLU-devel-9.0.3-1.fc38.x86_64 mesa-libgbm-23.1.9-1.fc38.x86_64 mesa-libglapi-23.1.9-1.fc38.x86_64 metis-5.2.1-20230403.0.gite0f1b88b.fc38.x86_64 miniz-3.0.2-2.fc38.x86_64 miniz-devel-3.0.2-2.fc38.x86_64 minizip-ng-3.0.7-4.fc38.x86_64 mkfontscale-1.2.2-3.fc38.x86_64 mockito-3.12.4-6.fc38.noarch mp-3.1.0-41.20200303git7fd4828.fc38.x86_64 mpdecimal-2.5.1-6.fc38.x86_64 mpfr-devel-4.1.1-3.fc38.x86_64 mpg123-libs-1.31.3-1.fc38.x86_64 mtdev-1.1.6-5.fc38.x86_64 ncurses-6.4-7.20230520.fc38.1.x86_64 neon2sse-devel-0-20230131.0.git097a5eca.fc38.noarch netcdf-4.9.0-5.fc38.x86_64 netpbm-11.02.00-1.fc38.x86_64 nettle-3.8-3.fc38.x86_64 nnpack-0-20230201.0.git70a77f48.fc38.x86_64 nnpack-devel-0-20230201.0.git70a77f48.fc38.x86_64 nspr-4.35.0-17.fc38.x86_64 nss-3.99.0-1.fc38.x86_64 
nss-softokn-3.99.0-1.fc38.x86_64 nss-softokn-freebl-3.99.0-1.fc38.x86_64 nss-sysinit-3.99.0-1.fc38.x86_64 nss-util-3.99.0-1.fc38.x86_64 numactl-devel-2.0.16-2.fc38.x86_64 numactl-libs-2.0.16-2.fc38.x86_64 objectweb-asm-9.3-5.fc38.noarch objenesis-3.3-2.fc38.noarch ocl-icd-2.3.2-1.fc38.x86_64 ocl-icd-devel-2.3.2-1.fc38.x86_64 ogdi-4.1.0-10.fc38.x86_64 onnx-devel-1.17.0-20240404.0.git4128a090.fc38.x86_64 onnx-libs-1.17.0-20240404.0.git4128a090.fc38.x86_64 onnx-optimizer-0.3.19-20240303.0.gitb3a46118.fc38.x86_64 onnx-optimizer-devel-0.3.19-20240303.0.gitb3a46118.fc38.x86_64 openblas-0.3.21-4.fc38.x86_64 openblas-devel-0.3.21-4.fc38.x86_64 openblas-openmp-0.3.21-4.fc38.x86_64 openblas-openmp64-0.3.21-4.fc38.x86_64 openblas-openmp64_-0.3.21-4.fc38.x86_64 openblas-serial-0.3.21-4.fc38.x86_64 openblas-serial64-0.3.21-4.fc38.x86_64 openblas-serial64_-0.3.21-4.fc38.x86_64 openblas-threads-0.3.21-4.fc38.x86_64 openblas-threads64-0.3.21-4.fc38.x86_64 openblas-threads64_-0.3.21-4.fc38.x86_64 opencl-headers-3.0-18.20231003git9ce9a72.fc38.noarch opencore-amr-0.1.6-3.fc38.x86_64 opencv-4.9.0-20231227.1.cu12_3.fc38.x86_64 opencv-contrib-4.9.0-20231227.1.cu12_3.fc38.x86_64 opencv-core-4.9.0-20231227.1.cu12_3.fc38.x86_64 opencv-cuda-4.9.0-20231227.1.cu12_3.fc38.x86_64 opencv-devel-4.9.0-20231227.1.cu12_3.fc38.x86_64 opencv-static-4.9.0-20231227.1.cu12_3.fc38.x86_64 openexr-libs-3.1.10-1.fc38.x86_64 openjpeg2-2.5.2-1.fc38.x86_64 openpgm-5.2.122-31.fc38.x86_64 openpgm-devel-5.2.122-31.fc38.x86_64 openslide-3.4.1-23.fc38.x86_64 openssh-9.0p1-19.fc38.x86_64 openssh-clients-9.0p1-19.fc38.x86_64 opentest4j-1.2.0-12.fc38.noarch opus-1.3.1-12.fc38.x86_64 orc-0.4.33-2.fc38.x86_64 pango-1.50.14-1.fc38.x86_64 pcre-8.45-1.fc38.3.x86_64 pcre2-devel-10.42-1.fc38.1.x86_64 pcre2-utf16-10.42-1.fc38.1.x86_64 pcre2-utf32-10.42-1.fc38.1.x86_64 peachpy-python3-0-20221113.1.git349e8f83.fc38.noarch perl-AutoLoader-5.74-498.fc38.noarch perl-B-1.83-498.fc38.x86_64 perl-Carp-1.52-490.fc38.noarch 
perl-Class-Struct-0.66-498.fc38.noarch perl-Data-Dumper-2.184-491.fc38.x86_64 perl-Digest-1.20-490.fc38.noarch perl-Digest-MD5-2.58-490.fc38.x86_64 perl-DynaLoader-1.52-498.fc38.x86_64 perl-Encode-4:3.19-493.fc38.x86_64 perl-Errno-1.36-498.fc38.x86_64 perl-Error-1:0.17029-11.fc38.noarch perl-Exporter-5.77-490.fc38.noarch perl-Fcntl-1.15-498.fc38.x86_64 perl-File-Basename-2.85-498.fc38.noarch perl-File-Find-1.40-498.fc38.noarch perl-File-Path-2.18-490.fc38.noarch perl-File-Temp-1:0.231.100-490.fc38.noarch perl-File-stat-1.12-498.fc38.noarch perl-FileHandle-2.03-498.fc38.noarch perl-Getopt-Long-1:2.54-2.fc38.noarch perl-Getopt-Std-1.13-498.fc38.noarch perl-Git-2.44.0-1.fc38.noarch perl-HTTP-Tiny-0.086-2.fc38.noarch perl-IO-1.50-498.fc38.x86_64 perl-IO-Socket-IP-0.41-492.fc38.noarch perl-IO-Socket-SSL-2.081-1.fc38.noarch perl-IPC-Open3-1.22-498.fc38.noarch perl-MIME-Base64-3.16-490.fc38.x86_64 perl-Mozilla-CA-20221114-2.fc38.noarch perl-Net-SSLeay-1.92-5.fc38.x86_64 perl-POSIX-2.03-498.fc38.x86_64 perl-PathTools-3.84-490.fc38.x86_64 perl-Pod-Escapes-1:1.07-490.fc38.noarch perl-Pod-Perldoc-3.28.01-491.fc38.noarch perl-Pod-Simple-1:3.43-491.fc38.noarch perl-Pod-Usage-4:2.03-4.fc38.noarch perl-Scalar-List-Utils-5:1.63-490.fc38.x86_64 perl-SelectSaver-1.02-498.fc38.noarch perl-Socket-4:2.036-2.fc38.x86_64 perl-Storable-1:3.26-490.fc38.x86_64 perl-Symbol-1.09-498.fc38.noarch perl-Term-ANSIColor-5.01-491.fc38.noarch perl-Term-Cap-1.18-1.fc38.noarch perl-TermReadKey-2.38-16.fc38.x86_64 perl-Text-ParseWords-3.31-490.fc38.noarch perl-Text-Tabs+Wrap-2023.0511-1.fc38.noarch perl-Time-Local-2:1.300-490.fc38.noarch perl-URI-5.17-2.fc38.noarch perl-base-2.27-498.fc38.noarch perl-constant-1.33-491.fc38.noarch perl-if-0.61.000-498.fc38.noarch perl-interpreter-4:5.36.3-498.fc38.x86_64 perl-lib-0.65-498.fc38.x86_64 perl-libnet-3.15-1.fc38.noarch perl-libs-4:5.36.3-498.fc38.x86_64 perl-locale-1.10-498.fc38.noarch perl-mro-1.26-498.fc38.x86_64 perl-overload-1.35-498.fc38.noarch 
perl-overloading-0.02-498.fc38.noarch perl-parent-1:0.241-1.fc38.noarch perl-podlators-1:5.01-2.fc38.noarch perl-vars-1.05-498.fc38.noarch pixman-0.42.2-1.fc38.x86_64 poppler-23.02.0-3.fc38.x86_64 poppler-data-0.4.11-4.fc38.noarch poppler-glib-23.02.0-3.fc38.x86_64 procps-ng-3.3.17-11.fc38.x86_64 proj-9.1.1-1.fc38.x86_64 proj-data-9.1.1-1.fc38.noarch protobuf-3.19.6-2.fc38.x86_64 protobuf-compat-3.21.9-2.fc38.x86_64 protobuf-compat-compiler-3.21.9-2.fc38.x86_64 protobuf-compat-devel-3.21.9-2.fc38.x86_64 psimd-devel-1:0-20200517.2.git072586a7.fc38.noarch pthreadpool-1:0.1-20240121.0.git178e3e06.fc38.x86_64 pthreadpool-devel-1:0.1-20240121.0.git178e3e06.fc38.x86_64 pugixml-1.13-2.fc38.x86_64 pybind11-devel-2.10.3-2.fc38.x86_64 pyproject-rpm-macros-1.12.0-1.fc38.noarch python-pip-wheel-22.3.1-3.fc38.noarch python-rpm-macros-3.11-10.fc38.noarch python-setuptools-wheel-65.5.1-2.fc38.noarch python3-3.11.8-2.fc38.x86_64 python3-devel-3.11.8-2.fc38.x86_64 python3-libs-3.11.8-2.fc38.x86_64 python3-numpy-1:1.24.4-1.fc38.x86_64 python3-packaging-23.0-1.fc38.noarch python3-pybind11-2.10.3-2.fc38.x86_64 python3-pyyaml-6.0-6.fc38.x86_64 python3-rpm-generators-14-4.fc38.noarch python3-rpm-macros-3.11-10.fc38.noarch python3-setuptools-65.5.1-2.fc38.noarch python3-six-1.16.0-9.fc38.noarch python3-typing-extensions-4.5.0-1.fc38.noarch qnnpack-0-20190828.2.git7d2a4e99.fc38.x86_64 qnnpack-devel-0-20190828.2.git7d2a4e99.fc38.x86_64 qt-settings-38.3-1.fc38.noarch qt5-qtbase-5.15.12-5.fc38.x86_64 qt5-qtbase-common-5.15.12-5.fc38.noarch qt5-qtbase-gui-5.15.12-5.fc38.x86_64 rav1e-libs-0.7.1-1.fc38.x86_64 rdma-core-devel-44.0-3.fc38.x86_64 rhash-1.4.3-2.fc38.x86_64 rocksdb-7.8.3-1.fc38.x86_64 rocksdb-devel-7.8.3-1.fc38.x86_64 samba-client-libs-2:4.18.11-1.fc38.x86_64 samba-common-2:4.18.11-1.fc38.noarch samba-common-libs-2:4.18.11-1.fc38.x86_64 scotch-6.1.2-3.fc37.x86_64 shared-mime-info-2.2-3.fc38.x86_64 sleef-3.6-20240320.0.git60e76d2b.fc38.x86_64 
sleef-devel-3.6-20240320.0.git60e76d2b.fc38.x86_64 snappy-1.1.9-7.fc38.x86_64 snappy-devel-1.1.9-7.fc38.x86_64 soxr-0.1.3-13.fc38.x86_64 speex-1.2.0-13.fc38.x86_64 srt-libs-1.5.2-1.fc38.x86_64 suitesparse-5.13.0-2.fc38.x86_64 svt-av1-libs-1.4.1-2.fc38.x86_64 systemd-253.17-1.fc38.x86_64 systemd-pam-253.17-1.fc38.x86_64 systemd-rpm-macros-253.17-1.fc38.noarch tbb-2020.3-16.fc38.x86_64 tbb-devel-2020.3-16.fc38.x86_64 tcl-1:8.6.12-4.fc38.x86_64 tensorpipe-0-20220513.1.gitbb1473a4.fc37.x86_64 tensorpipe-devel-0-20220513.1.gitbb1473a4.fc37.x86_64 twolame-libs-0.4.0-2.fc38.x86_64 tzdata-java-2024a-1.fc38.noarch unixODBC-2.3.11-2.fc38.x86_64 uriparser-0.9.7-2.fc38.x86_64 urw-base35-bookman-fonts-20200910-16.fc38.noarch urw-base35-c059-fonts-20200910-16.fc38.noarch urw-base35-d050000l-fonts-20200910-16.fc38.noarch urw-base35-fonts-20200910-16.fc38.noarch urw-base35-fonts-common-20200910-16.fc38.noarch urw-base35-gothic-fonts-20200910-16.fc38.noarch urw-base35-nimbus-mono-ps-fonts-20200910-16.fc38.noarch urw-base35-nimbus-roman-fonts-20200910-16.fc38.noarch urw-base35-nimbus-sans-fonts-20200910-16.fc38.noarch urw-base35-p052-fonts-20200910-16.fc38.noarch urw-base35-standard-symbols-ps-fonts-20200910-16.fc38.noarch urw-base35-z003-fonts-20200910-16.fc38.noarch vapoursynth-libs-58-4.fc38.x86_64 vim-filesystem-2:9.1.264-1.fc38.noarch vo-amrwbenc-0.1.3-18.fc38.x86_64 vtk-9.2.5-2.fc38.x86_64 xapian-core-libs-1.4.23-1.fc38.x86_64 xcb-util-0.4.1-2.fc38.x86_64 xcb-util-image-0.4.1-2.fc38.x86_64 xcb-util-keysyms-0.4.1-2.fc38.x86_64 xcb-util-renderutil-0.3.10-2.fc38.x86_64 xcb-util-wm-0.4.2-2.fc38.x86_64 xerces-c-3.2.5-1.fc38.x86_64 xkeyboard-config-2.38-1.fc38.noarch xml-common-0.6.3-60.fc38.noarch xorg-x11-fonts-ISO8859-1-100dpi-7.5-35.fc38.noarch xorg-x11-proto-devel-2022.2-3.fc38.noarch xvidcore-1.3.7-9.fc38.x86_64 zeromq-4.3.4-5.fc38.x86_64 zeromq-devel-4.3.4-5.fc38.x86_64 zimg-3.0.5-1.fc38.x86_64 zlib-devel-1.2.13-3.fc38.x86_64 zvbi-0.2.35-19.fc38.x86_64 Complete! 
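The %prep stage in the rpmbuild output that follows pins PyTorch to an exact commit using a shallow clone-then-fetch-by-SHA pattern (`git clone --depth 1 -n`, `git fetch --depth 1 origin <sha>`, `git reset --hard <sha>`). A minimal sketch of that same pattern against a throwaway local repository; the temp-dir layout, commit messages, and the `uploadpack.allowAnySHA1InWant` toggle (needed so a plain SHA can be fetched from a local remote) are illustrative assumptions, not taken from the log:

```shell
set -e
tmp=$(mktemp -d)

# Build a tiny "origin" repo with two commits.
git init -q "$tmp/origin"
git -C "$tmp/origin" -c user.email=a@b -c user.name=t commit -q --allow-empty -m one
git -C "$tmp/origin" -c user.email=a@b -c user.name=t commit -q --allow-empty -m two
sha=$(git -C "$tmp/origin" rev-parse HEAD)

# Allow fetching an arbitrary SHA (public hosts like GitHub permit this;
# a local remote needs it enabled explicitly).
git -C "$tmp/origin" config uploadpack.allowAnySHA1InWant true

# The pattern from the log: clone without checkout, fetch the pinned SHA
# shallowly, then hard-reset the work tree onto it.
git clone -q -n "$tmp/origin" "$tmp/work"
git -C "$tmp/work" fetch -q --depth 1 origin "$sha"
git -C "$tmp/work" reset -q --hard "$sha"
git -C "$tmp/work" log -1 --format=%s
```

The same three steps appear in the %prep trace below with the pinned commit `7efaf54dc46034189cb36b345764a5a9a5b693d4`; the shallow fetch keeps the buildroot download small while still landing on a reproducible, exact source state.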
Finish: build setup for pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.src.rpm Start: rpmbuild pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.src.rpm warning: %patchN is deprecated (2 usages found), use %patch N (or %patch -P N) Building target platforms: x86_64 Building for target x86_64 setting SOURCE_DATE_EPOCH=1554595200 Executing(%prep): /bin/sh -e /var/tmp/rpm-tmp.ec8anC + umask 022 + cd /builddir/build/BUILD + cd /builddir/build/BUILD + rm -rf pytorch + /usr/bin/mkdir -p pytorch + cd pytorch + /usr/bin/chmod -Rf a+rX,u+w,g-w,o-w . + git clone --depth 1 -n -b main https://github.com/pytorch/pytorch.git . Cloning into '.'... + git fetch --depth 1 origin 7efaf54dc46034189cb36b345764a5a9a5b693d4 From https://github.com/pytorch/pytorch * branch 7efaf54dc46034189cb36b345764a5a9a5b693d4 -> FETCH_HEAD + git reset --hard 7efaf54dc46034189cb36b345764a5a9a5b693d4 Updating files: 100% (18485/18485), done. HEAD is now at 7efaf54 Fakeifying views shouldnt create symbols when dynamic=False (#123348) + git submodule update --init --depth 1 third_party/fmt Submodule 'third_party/fmt' (https://github.com/fmtlib/fmt.git) registered for path 'third_party/fmt' Cloning into '/builddir/build/BUILD/pytorch/third_party/fmt'... From https://github.com/fmtlib/fmt * branch e69e5f977d458f2650bb346dadf2ad30c5320281 -> FETCH_HEAD Submodule path 'third_party/fmt': checked out 'e69e5f977d458f2650bb346dadf2ad30c5320281' + git submodule update --init --depth 1 third_party/XNNPACK Submodule 'third_party/XNNPACK' (https://github.com/google/XNNPACK.git) registered for path 'third_party/XNNPACK' Cloning into '/builddir/build/BUILD/pytorch/third_party/XNNPACK'... 
From https://github.com/google/XNNPACK * branch fcbf55af6cf28a4627bcd1f703ab7ad843f0f3a2 -> FETCH_HEAD Submodule path 'third_party/XNNPACK': checked out 'fcbf55af6cf28a4627bcd1f703ab7ad843f0f3a2' + git submodule update --init --depth 1 third_party/ittapi Submodule 'third_party/ittapi' (https://github.com/intel/ittapi.git) registered for path 'third_party/ittapi' Cloning into '/builddir/build/BUILD/pytorch/third_party/ittapi'... From https://github.com/intel/ittapi * branch 5b8a7d7422611c3a0d799fb5fc5dd4abfae35b42 -> FETCH_HEAD Submodule path 'third_party/ittapi': checked out '5b8a7d7422611c3a0d799fb5fc5dd4abfae35b42' + git submodule update --init --depth 1 third_party/pocketfft Submodule 'third_party/pocketfft' (https://github.com/mreineck/pocketfft) registered for path 'third_party/pocketfft' Cloning into '/builddir/build/BUILD/pytorch/third_party/pocketfft'... From https://github.com/mreineck/pocketfft * branch 9d3ab05a7fffbc71a492bc6a17be034e83e8f0fe -> FETCH_HEAD Submodule path 'third_party/pocketfft': checked out '9d3ab05a7fffbc71a492bc6a17be034e83e8f0fe' + git submodule update --init --depth 1 third_party/cudnn_frontend Submodule 'third_party/cudnn_frontend' (https://github.com/NVIDIA/cudnn-frontend.git) registered for path 'third_party/cudnn_frontend' Cloning into '/builddir/build/BUILD/pytorch/third_party/cudnn_frontend'... 
From https://github.com/NVIDIA/cudnn-frontend * branch 150798fe976556078f443fdb059a1ff0361f58a2 -> FETCH_HEAD Submodule path 'third_party/cudnn_frontend': checked out '150798fe976556078f443fdb059a1ff0361f58a2' + git --no-pager log --format=fuller commit 7efaf54dc46034189cb36b345764a5a9a5b693d4 Author: Brian Hirsh AuthorDate: Thu Apr 11 08:19:28 2024 -0700 Commit: PyTorch MergeBot CommitDate: Fri Apr 12 01:12:23 2024 +0000 Fakeifying views shouldnt create symbols when dynamic=False (#123348) Fixes https://github.com/pytorch/pytorch/issues/123298 I was also seeing some crashes in torchtrain due to dynamic shapes, even when I set `compile(dynamic=False)` (cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @chenyang78 @kadeng @chauhang @wanchaol). This doesn't fix the underlying dynamic shape issues with compile + DTensor, but it does prevent dynamic shapes from leaking in. Pull Request resolved: https://github.com/pytorch/pytorch/pull/123348 Approved by: https://github.com/ezyang ghstack dependencies: #122502, #122751 Patch #1 (pytorch-C.patch): + echo 'Patch #1 (pytorch-C.patch):' + /usr/bin/patch --no-backup-if-mismatch -f -p0 -b --suffix .python~ --fuzz=100 patching file torch/CMakeLists.txt Hunk #1 succeeded at 277 (offset -2 lines). Patch #5 (pytorch-cuda12.patch): + echo 'Patch #5 (pytorch-cuda12.patch):' + /usr/bin/patch --no-backup-if-mismatch -f -p1 -b --suffix .cu12~ --fuzz=100 patching file aten/src/ATen/native/nested/cuda/NestedTensorMatmul.cu patching file aten/src/ATen/native/nested/cuda/NestedTensorTransformerFunctions.cu patching file aten/src/ATen/native/transformers/cuda/attention.cu Hunk #1 succeeded at 1 with fuzz 3. patching file aten/src/ATen/native/transformers/cuda/attention_backward.cu Hunk #1 succeeded at 1 with fuzz 3. patching file aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernel_backward.h Hunk #1 succeeded at 1 with fuzz 3. 
patching file aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernel_forward.h
Hunk #1 succeeded at 1 with fuzz 3.
patching file aten/src/ATen/native/transformers/cuda/flash_attn/flash_bwd_launch_template.h
Hunk #1 succeeded at 1 with fuzz 3.
patching file aten/src/ATen/native/transformers/cuda/flash_attn/flash_fwd_launch_template.h
Hunk #1 succeeded at 1 with fuzz 3.
+ sed -i -e 's|VERSION_LESS 3.7)|VERSION_LESS 3.6)|g' cmake/Dependencies.cmake
+ sed -i -e 's|PY_MAJOR_VERSION == 3|PY_MAJOR_VERSION == 3 \&\& PY_MINOR_VERSION > 6|' torch/csrc/dynamo/eval_frame.c
+ sed -i 's|CMAKE_CXX_STANDARD 14|CMAKE_CXX_STANDARD 17|' CMakeLists.txt
+ sed -i -e 's|torch_cpu PUBLIC c10|torch_cpu PUBLIC c10 qnnpack gloo gloo_cuda|' caffe2/CMakeLists.txt
+ sed -i -e 's|USE_SYSTEM_BIND11|USE_SYSTEM_PYBIND11|g' cmake/Dependencies.cmake
+ rm -rf 'third_party/pthreadpool/*'
+ touch third_party/pthreadpool/CMakeLists.txt
+ sed -i -e 's|NAMES openblas|NAMES openblaso openblas|' cmake/Modules/FindOpenBLAS.cmake
+ sed -i -e 's|USE_ZSTD|NOT_USE_ZSTD|g' cmake/Dependencies.cmake
+ sed -i -e 's|add_subdirectory(zstd)|list(APPEND Caffe2_PUBLIC_DEPENDENCY_LIBS zstd)|g' caffe2/share/contrib/CMakeLists.txt
+ sed -i -e 's|Caffe2_DEPENDENCY_LIBS onnx_proto onnx|Caffe2_DEPENDENCY_LIBS onnx_proto onnx onnx_optimizer|' cmake/Dependencies.cmake
+ mkdir -p third_party/tensorpipe
+ echo ''
+ sed -i '/add_dependencies(tensorpipe_agent tensorpipe)/d' caffe2/CMakeLists.txt
+ echo ''
+ echo 'set(NNPACK_FOUND TRUE)'
+ sed -i '/TARGET cpuinfo PROPERTY/d' cmake/Dependencies.cmake
+ sed -i '/APPEND Caffe2_DEPENDENCY_LIBS fp16/d' cmake/Dependencies.cmake
+ mkdir -p third_party/QNNPACK
+ echo ''
+ sed -i '/TARGET qnnpack PROPERTY/d' cmake/Dependencies.cmake
+ sed -i -e '/target_compile_options(qnnpack/d' cmake/Dependencies.cmake
+ mkdir -p third_party/psimd
+ echo ''
+ sed -i '/pytorch_qnnpack PRIVATE psimd/d' aten/src/ATen/native/quantized/cpu/qnnpack/CMakeLists.txt
+ sed -i '/NOT TARGET fxdiv/,/endif/d' caffe2/CMakeLists.txt
+ sed -i '/torch_cpu PRIVATE fxdiv/d' caffe2/CMakeLists.txt
+ sed -i '/pytorch_qnnpack PRIVATE fxdiv/d' aten/src/ATen/native/quantized/cpu/qnnpack/CMakeLists.txt
+ mkdir -p third_party/fbgemm
+ echo ''
+ sed -i '/(TARGET fbgemm/d' cmake/Dependencies.cmake
+ sed -i 's|caffe2_fakelowp_ops fbgemm cpuinfo|caffe2_fakelowp_ops|' caffe2/contrib/fakelowp/CMakeLists.txt
+ sed -i 's|caffe2_dnnlowp_avx2_ops fbgemm|caffe2_dnnlowp_avx2_ops|' caffe2/quantization/server/CMakeLists.txt
+ mkdir -p third_party/foxi
+ echo ''
+ sed -i '/if(NOT TARGET kineto)/,/endif()/d' cmake/Dependencies.cmake
+ sed -i 's|libkineto/include|libkineto/include\n/usr/include/kineto|' torch/CMakeLists.txt
+ sed -i 's|libkineto/include|libkineto/include\n/usr/include/kineto|' caffe2/CMakeLists.txt
+ mkdir -p third_party/onnx-tensorrt
+ echo ''
+ sed -i /nvonnxparser_static/d cmake/Dependencies.cmake
+ sed -i 's|onnx_trt_library|nvonnxparser_static|g' cmake/Dependencies.cmake
+ rm -rf torch/csrc/jit/serialization/mobile_bytecode_generated.h
+ flatc --cpp --gen-mutable --scoped-enums -o torch/csrc/jit/serialization -c torch/csrc/jit/serialization/mobile_bytecode.fbs
+ echo '// @generated'
+ sed -i '/find_package(RocksDB CONFIG)/d' modules/rocksdb/CMakeLists.txt
+ sed -i 's|RocksDB::rocksdb|RocksDB::rocksdb-shared|' modules/rocksdb/CMakeLists.txt
+ mv -f cmake/Modules_CUDA_fix/FindCUDNN.cmake cmake/Modules
+ rm -rf cmake/Modules_CUDA_fix
+ find . -type d -name FindCUDA -exec rm -rf '{}' ';'
+ sed -i -e '/install/{:a;/COMPONENT/bb;N;ba;:b;/Modules_CUDA_fix/d;}' CMakeLists.txt
+ sed -i -e 's|CMAKE_CUDA_FLAGS "-D|CMAKE_CUDA_FLAGS " -D|' CMakeLists.txt
+ sed -i '/install(EXPORT Caffe2Targets/,/dev)/d' CMakeLists.txt
+ sed -i 's|SYSTEM ||g' c10/CMakeLists.txt
+ sed -i 's|SYSTEM ||g' torch/CMakeLists.txt
+ sed -i 's|SYSTEM ||g' caffe2/CMakeLists.txt
+ sed -i 's|BEFORE SYSTEM ||g' cmake/ProtoBuf.cmake
+ sed -i 's|AFTER SYSTEM ||g' cmake/Dependencies.cmake
+ sed -i 's|BEFORE SYSTEM ||g' cmake/Dependencies.cmake
+ sed -i 's|SYSTEM ||g' cmake/Dependencies.cmake
+ sed -i '1i #include ' c10/util/Registry.h
+ sed -i '1i #include ' c10/core/DispatchKey.h
+ sed -i '1i #include ' torch/csrc/jit/runtime/logging.cpp
+ sed -i '1i #include ' torch/csrc/lazy/core/multi_wait.cpp
+ sed -i '1i #include "stdint.h"' torch/csrc/jit/passes/quantization/quantization_type.h
+ RPM_EC=0
++ jobs -p
+ exit 0
Executing(%build): /bin/sh -e /var/tmp/rpm-tmp.zEKXch
+ umask 022
+ cd /builddir/build/BUILD
+ CFLAGS='-O2 -fexceptions -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -w -fpermissive -Wno-sign-compare -Wno-deprecated-declarations -Wno-nonnull -DEIGEN_HAS_CXX11_MATH=1 '
+ export CFLAGS
+ CXXFLAGS='-O2 -fexceptions -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -w -fpermissive -Wno-sign-compare -Wno-deprecated-declarations -Wno-nonnull -DEIGEN_HAS_CXX11_MATH=1 '
+ export CXXFLAGS
+ FFLAGS='-O2 -fexceptions -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -w -fpermissive -Wno-sign-compare -Wno-deprecated-declarations -Wno-nonnull -DEIGEN_HAS_CXX11_MATH=1 -I/usr/lib64/gfortran/modules '
+ export FFLAGS
+ FCFLAGS='-O2 -fexceptions -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -w -fpermissive -Wno-sign-compare -Wno-deprecated-declarations -Wno-nonnull -DEIGEN_HAS_CXX11_MATH=1 -I/usr/lib64/gfortran/modules '
+ export FCFLAGS
+ VALAFLAGS=-g
+ export VALAFLAGS
+ LDFLAGS='-Wl,-z,relro -Wl,--as-needed -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes -Wl,-lstdc++'
+ export LDFLAGS
+ LT_SYS_LIBRARY_PATH=/usr/lib64:
+ export LT_SYS_LIBRARY_PATH
+ CC=gcc
+ export CC
+ CXX=g++
+ export CXX
+ cd pytorch
+ mkdir build
+ pushd build
~/build/BUILD/pytorch/build ~/build/BUILD/pytorch
+ export ONNX_ML=0
+ ONNX_ML=0
+ export BUILD_SPLIT_CUDA=ON
+ BUILD_SPLIT_CUDA=ON
+ export REL_WITH_DEB_INFO=1
+ REL_WITH_DEB_INFO=1
+ export TORCH_NVCC_FLAGS=-DCUDA_HAS_FP16
+ TORCH_NVCC_FLAGS=-DCUDA_HAS_FP16
+ export PYTHON_EXECUTABLE=/usr/bin/python3
+ PYTHON_EXECUTABLE=/usr/bin/python3
+ export LDFLAGS=-Wl,-lstdc++
+ LDFLAGS=-Wl,-lstdc++
+ export LD_LIBRARY_PATH=/usr/local/cuda-12.3/lib64/
+ LD_LIBRARY_PATH=/usr/local/cuda-12.3/lib64/
+ CFLAGS='-O2 -fexceptions -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -w -fpermissive -Wno-sign-compare -Wno-deprecated-declarations -Wno-nonnull -DEIGEN_HAS_CXX11_MATH=1 '
+ export CFLAGS
+ CXXFLAGS='-O2 -fexceptions -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -w -fpermissive -Wno-sign-compare -Wno-deprecated-declarations -Wno-nonnull -DEIGEN_HAS_CXX11_MATH=1 '
+ export CXXFLAGS
+ FFLAGS='-O2 -fexceptions -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -w -fpermissive -Wno-sign-compare -Wno-deprecated-declarations -Wno-nonnull -DEIGEN_HAS_CXX11_MATH=1 -I/usr/lib64/gfortran/modules '
+ export FFLAGS
+ FCFLAGS='-O2 -fexceptions -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -w -fpermissive -Wno-sign-compare -Wno-deprecated-declarations -Wno-nonnull -DEIGEN_HAS_CXX11_MATH=1 -I/usr/lib64/gfortran/modules '
+ export FCFLAGS
+ VALAFLAGS=-g
+ export VALAFLAGS
+ LDFLAGS=-Wl,-lstdc++
+ export LDFLAGS
+ LT_SYS_LIBRARY_PATH=/usr/lib64:
+ export LT_SYS_LIBRARY_PATH
+ CC=gcc
+ export CC
+ CXX=g++
+ export CXX
+ /usr/bin/cmake -DCMAKE_C_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_CXX_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_Fortran_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_VERBOSE_MAKEFILE:BOOL=ON -DCMAKE_INSTALL_DO_STRIP:BOOL=OFF -DCMAKE_INSTALL_PREFIX:PATH=/usr -DINCLUDE_INSTALL_DIR:PATH=/usr/include -DLIB_INSTALL_DIR:PATH=/usr/lib64 -DSYSCONF_INSTALL_DIR:PATH=/etc -DSHARE_INSTALL_PREFIX:PATH=/usr/share -DLIB_SUFFIX=64 -DBUILD_SHARED_LIBS:BOOL=ON .. -Wno-dev -DCMAKE_SKIP_RPATH=ON -DCMAKE_VERBOSE_MAKEFILE=OFF -DCMAKE_BUILD_TYPE=Release -DCMAKE_NO_SYSTEM_FROM_IMPORTED=ON -DCMAKE_SKIP_RULE_DEPENDENCY=ON -DCMAKE_SUPPRESS_REGENERATION=ON -DUSE_CCACHE=OFF -DHAVE_SOVERSION=ON -DUSE_NATIVE_ARCH=OFF -DUSE_DISTRIBUTED=ON -DBUILD_DOCS=OFF -DBUILD_PYTHON=ON -DBUILD_FUNCTORCH=ON -DBUILD_CAFFE2=OFF -DBUILD_BINARY=OFF -DBUILD_BENCHMARK=OFF -DBUILD_CUSTOM_PROTOBUF=OFF -DBUILDING_WITH_TORCH_LIBS=ON -DPYTHON_EXECUTABLE=/usr/bin/python3 -DPYBIND11_PYTHON_VERSION=3.11 -DCAFFE2_LINK_LOCAL_PROTOBUF=OFF -DONNX_ML=OFF -DUSE_GLOG=ON -DUSE_GFLAGS=ON -DUSE_OPENMP=ON -DUSE_KINETO=ON -DUSE_BREAKPAD=OFF -DUSE_SYSTEM_ONNX=ON -DUSE_SYSTEM_GLOO=ON -DUSE_SYSTEM_PYBIND11=ON -DUSE_SYSTEM_EIGEN_INSTALL=ON -DUSE_CUDA=ON -DUSE_CUDNN=ON -DUSE_NVRTC=ON -DUSE_CUPTI_SO=ON -DUSE_FAST_NVCC=ON -DUSE_SYSTEM_NCCL=ON -DCMAKE_CUDA_FLAGS=-fPIC -DCUDA_PROPAGATE_HOST_FLAGS=OFF '-DTORCH_CUDA_ARCH_LIST=5.2+PTX 6.1 7.5 8.6 8.9 9.0' -DCUDA_HOST_COMPILER=/usr/bin/cuda-g++ -DCMAKE_CUDA_HOST_COMPILER=/usr/bin/cuda-g++ -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-12.3 -DCMAKE_CUDA_COMPILER=/usr/local/cuda-12.3/bin/nvcc '-DCUDA_NVCC_FLAGS=--compiler-options;-fPIC;-Wno-deprecated-gpu-targets;-allow-unsupported-compiler;--fatbin-options;-compress-all' '-DCMAKE_CUDA_FLAGS=--compiler-options -fPIC -Wno-deprecated-gpu-targets -allow-unsupported-compiler --fatbin-options -compress-all' -DNCCL_INCLUDE_DIR=/usr/include/nccl -DUSE_MAGMA=ON -DBUILD_SPLIT_CUDA=ON -DUSE_TENSORRT=OFF -DBLAS=OpenBLAS -DUSE_MPI=OFF -DUSE_OBSERVERS=OFF -DUSE_ASAN=OFF -DUSE_ROCM=OFF -DUSE_MKLDNN=OFF -DUSE_FBGEMM=ON -DUSE_NNPACK=ON -DUSE_QNNPACK=ON -DUSE_PYTORCH_QNNPACK=ON -DUSE_SYSTEM_FP16=ON -DUSE_SYSTEM_PSIMD=ON -DUSE_SYSTEM_SLEEF=ON -DUSE_SYSTEM_FXDIV=ON -DUSE_SYSTEM_XNNPACK=OFF -DUSE_SYSTEM_CPUINFO=ON -DUSE_SYSTEM_PTHREADPOOL=ON -DUSE_TENSORPIPE=ON -DUSE_FAKELOWP=OFF -DUSE_OPENCL=OFF -DUSE_GLOO=ON -DUSE_ZMQ=ON -DUSE_ZSTD=ON -DUSE_LMDB=ON -DUSE_REDIS=ON -DUSE_LEVELDB=ON -DUSE_ROCKSDB=ON -DUSE_FFMPEG=OFF -DUSE_OPENCV=ON -DUSE_METAL=OFF -DUSE_TBB=OFF -DUSE_LLVM=OFF -DATEN_NO_TEST=ON
-- The CXX compiler identification is GNU 13.2.1
-- The C compiler identification is GNU 13.2.1
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- /usr/bin/g++ /builddir/build/BUILD/pytorch/torch/abi-check.cpp -o /builddir/build/BUILD/pytorch/build/abi-check
-- Determined _GLIBCXX_USE_CXX11_ABI=1
-- Performing Test CAFFE2_NEED_TO_TURN_OFF_DEPRECATION_WARNING
-- Performing Test CAFFE2_NEED_TO_TURN_OFF_DEPRECATION_WARNING - Failed
-- Turning off deprecation warning due to glog.
-- Performing Test C_HAS_AVX_1
-- Performing Test C_HAS_AVX_1 - Failed
-- Performing Test C_HAS_AVX_2
-- Performing Test C_HAS_AVX_2 - Success
-- Performing Test C_HAS_AVX2_1
-- Performing Test C_HAS_AVX2_1 - Failed
-- Performing Test C_HAS_AVX2_2
-- Performing Test C_HAS_AVX2_2 - Success
-- Performing Test C_HAS_AVX512_1
-- Performing Test C_HAS_AVX512_1 - Failed
-- Performing Test C_HAS_AVX512_2
-- Performing Test C_HAS_AVX512_2 - Success
-- Performing Test CXX_HAS_AVX_1
-- Performing Test CXX_HAS_AVX_1 - Failed
-- Performing Test CXX_HAS_AVX_2
-- Performing Test CXX_HAS_AVX_2 - Success
-- Performing Test CXX_HAS_AVX2_1
-- Performing Test CXX_HAS_AVX2_1 - Failed
-- Performing Test CXX_HAS_AVX2_2
-- Performing Test CXX_HAS_AVX2_2 - Success
-- Performing Test CXX_HAS_AVX512_1
-- Performing Test CXX_HAS_AVX512_1 - Failed
-- Performing Test CXX_HAS_AVX512_2
-- Performing Test CXX_HAS_AVX512_2 - Success
-- Current compiler supports avx2 extension. Will build perfkernels.
-- Performing Test CAFFE2_COMPILER_SUPPORTS_AVX512_EXTENSIONS
-- Performing Test CAFFE2_COMPILER_SUPPORTS_AVX512_EXTENSIONS - Success
-- Current compiler supports avx512f extension. Will build fbgemm.
-- Performing Test COMPILER_SUPPORTS_HIDDEN_VISIBILITY
-- Performing Test COMPILER_SUPPORTS_HIDDEN_VISIBILITY - Success
-- Performing Test COMPILER_SUPPORTS_HIDDEN_INLINE_VISIBILITY
-- Performing Test COMPILER_SUPPORTS_HIDDEN_INLINE_VISIBILITY - Success
-- Performing Test COMPILER_SUPPORTS_RDYNAMIC
-- Performing Test COMPILER_SUPPORTS_RDYNAMIC - Success
-- Found CUDA: /usr/local/cuda-12.3 (found version "12.3")
-- The CUDA compiler identification is NVIDIA 12.3.107
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /usr/local/cuda-12.3/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- Found CUDAToolkit: /usr/local/cuda-12.3/include (found version "12.3.107")
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Caffe2: CUDA detected: 12.3
-- Caffe2: CUDA nvcc is: /usr/local/cuda-12.3/bin/nvcc
-- Caffe2: CUDA toolkit directory: /usr/local/cuda-12.3
-- Caffe2: Header version is: 12.3
-- /usr/local/cuda-12.3/lib64/libnvrtc.so shorthash is e150bf88
-- Found CUDNN: /usr/lib64/libcudnn.so
-- Could NOT find CUSPARSELT (missing: CUSPARSELT_LIBRARY_PATH CUSPARSELT_INCLUDE_PATH)
CMake Warning at cmake/public/cuda.cmake:275 (message):
  Cannot find cuSPARSELt library.  Turning the option off
Call Stack (most recent call first):
  cmake/Dependencies.cmake:44 (include)
  CMakeLists.txt:760 (include)
-- Added CUDA NVCC flags for: -gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_89,code=sm_89;-gencode;arch=compute_90,code=sm_90;-gencode;arch=compute_52,code=compute_52
-- Caffe2: Found protobuf with new-style protobuf targets.
-- Caffe2 protobuf include directory: /usr/include
-- Trying to find preferred BLAS backend of choice: OpenBLAS
-- Found OpenBLAS libraries: /usr/lib64/libopenblaso.so
-- Found OpenBLAS include: /usr/include/openblas
-- Using pocketfft in directory: /builddir/build/BUILD/pytorch/third_party/pocketfft/
-- Found pthreadpool: /usr/lib64/libpthreadpool.so
Found cpuinfo: /usr/lib64/libcpuinfo.so
-- The ASM compiler identification is GNU
-- Found assembler: /usr/bin/gcc
-- Caffe2: Found gflags with new-style gflags target.
-- Caffe2: Cannot find glog automatically. Using legacy find.
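As an aside, the `-gencode` flag list that CMake reports above is a mechanical expansion of the `-DTORCH_CUDA_ARCH_LIST='5.2+PTX 6.1 7.5 8.6 8.9 9.0'` option passed to the configure step: each architecture gets a `sm_XX` (real) entry, and any `+PTX` suffix additionally emits a `compute_XX` (virtual/PTX) entry at the end. The sketch below reproduces that expansion for illustration only; `gencode_flags` is a hypothetical helper, not PyTorch's actual CMake logic.

```python
def gencode_flags(arch_list: str) -> list:
    """Expand a TORCH_CUDA_ARCH_LIST-style string (e.g. "5.2+PTX 6.1")
    into nvcc -gencode entries, mirroring the expansion seen in this log.
    Illustrative sketch only -- not the real cmake/public/cuda.cmake code."""
    sm_entries = []   # real-architecture (SASS) entries
    ptx_entries = []  # virtual-architecture (PTX) entries, emitted last
    for arch in arch_list.split():
        wants_ptx = arch.endswith("+PTX")
        num = arch.removesuffix("+PTX").replace(".", "")
        sm_entries.append(f"-gencode;arch=compute_{num},code=sm_{num}")
        if wants_ptx:
            ptx_entries.append(f"-gencode;arch=compute_{num},code=compute_{num}")
    return sm_entries + ptx_entries

print(";".join(gencode_flags("5.2+PTX 6.1 7.5 8.6 8.9 9.0")))
```

Run against the arch list from this build, the output matches the "Added CUDA NVCC flags for:" line character for character, including the trailing `compute_52` PTX entry that lets newer GPUs JIT-compile the kernels.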
-- Found glog: /usr/include
-- Caffe2: Found glog (include: /usr/include, library: /usr/lib64/libglog.so)
-- Found LMDB: /usr/include
-- Found lmdb (include: /usr/include, library: /usr/lib64/liblmdb.so)
-- Found LevelDB: /usr/include
-- Found LevelDB (include: /usr/include, library: /usr/lib64/libleveldb.so)
-- Found Snappy: /usr/include
-- Found Snappy (include: /usr/include, library: /usr/lib64/libsnappy.so)
-- Found Numa: /usr/include
-- Found Numa (include: /usr/include, library: /usr/lib64/libnuma.so)
-- Found ZMQ: /usr/include
-- Found ZMQ (include: /usr/include, library: /usr/lib64/libzmq.so)
-- Found Hiredis: /usr/include
-- Found Hiredis (include: /usr/include, library: /usr/lib64/libhiredis.so)
-- OpenCV found (/usr/lib64/cmake/opencv4)
-- Found system Eigen at /usr/include/eigen3
-- Setting Python's include dir to /usr/include/python3.11 from sysconfig
-- Setting Python's library to /usr/lib64/python3.11
-- Found PythonInterp: /usr/bin/python3 (found suitable version "3.11.8", minimum required is "3.0")
-- Found PythonLibs: /usr/lib64/python3.11 (found suitable version "3.11.8", minimum required is "3.0")
-- Found NumPy: /usr/lib64/python3.11/site-packages/numpy/core/include (found version "1.24.4")
-- NumPy ver. 1.24.4 found (include: /usr/lib64/python3.11/site-packages/numpy/core/include)
-- Found PythonInterp: /usr/bin/python3 (found suitable version "3.11.8", minimum required is "3.11")
-- Found PythonLibs: /usr/lib64/python3.11
-- Performing Test HAS_FLTO
-- Performing Test HAS_FLTO - Success
-- Found pybind11: /usr/include (found version "2.10.3")
-- pybind11 include dirs: /usr/include;/usr/include/python3.11
-- Check OMP with lib /usr/lib/gcc/x86_64-redhat-linux/13/libgomp.so and flags -fopenmp -v
-- Check OMP with lib /usr/lib/gcc/x86_64-redhat-linux/13/libgomp.so and flags -fopenmp -v
-- Found OpenMP_C: -fopenmp (found version "4.5")
-- Found OpenMP_CXX: -fopenmp (found version "4.5")
-- Found OpenMP: TRUE (found version "4.5")
-- Adding OpenMP CXX_FLAGS: -fopenmp
-- Will link against OpenMP libraries: /usr/lib/gcc/x86_64-redhat-linux/13/libgomp.so
-- Found NCCL: /usr/include
-- Determining NCCL version from /usr/include/nccl.h...
-- Looking for NCCL_VERSION_CODE
-- Looking for NCCL_VERSION_CODE - not found
-- NCCL version < 2.3.5-5
-- Found NCCL (include: /usr/include, library: /usr/lib64/libnccl.so)
-- Found CUB: /usr/local/cuda-12.3/include
-- Converting CMAKE_CUDA_FLAGS to CUDA_NVCC_FLAGS:
CUDA_NVCC_FLAGS = --compiler-options;-fPIC;-Wno-deprecated-gpu-targets;-allow-unsupported-compiler;--fatbin-options;-compress-all;-DLIBCUDACXX_ENABLE_SIMPLIFIED_COMPLEX_OPERATIONS;-D_GLIBCXX_USE_CXX11_ABI=1;-Xfatbin;-compress-all;--compiler-options;-fPIC;-Wno-deprecated-gpu-targets;-allow-unsupported-compiler;--fatbin-options;-compress-all;-DONNX_NAMESPACE=onnx;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_89,code=sm_89;-gencode;arch=compute_90,code=sm_90;-gencode;arch=compute_52,code=compute_52;-Xcudafe;--diag_suppress=cc_clobber_ignored,--diag_suppress=field_without_dll_interface,--diag_suppress=base_class_has_different_dll_interface,--diag_suppress=dll_interface_conflict_none_assumed,--diag_suppress=dll_interface_conflict_dllexport_assumed,--diag_suppress=bad_friend_decl;--expt-relaxed-constexpr;--expt-extended-lambda
CUDA_NVCC_FLAGS_DEBUG = -g
CUDA_NVCC_FLAGS_RELEASE = -O3;-DNDEBUG
CUDA_NVCC_FLAGS_RELWITHDEBINFO = -O2;-g;-DNDEBUG
CUDA_NVCC_FLAGS_MINSIZEREL = -O1;-DNDEBUG
Found gloo: /usr/lib64/libgloo.so
-- Found onnx: /usr/lib64/libonnx.so /usr/lib64/libonnx_proto.so
-- Found CUDA with FP16 support, compiling with torch.cuda.HalfTensor
-- Adding -DNDEBUG to compile flags
-- Checking prototype magma_get_sgeqrf_nb for MAGMA_V2
-- Checking prototype magma_get_sgeqrf_nb for MAGMA_V2 - False
-- Compiling with MAGMA support
-- MAGMA INCLUDE DIRECTORIES: /usr/include
-- MAGMA LIBRARIES: /usr/lib64/libmagma.so
-- MAGMA V2 check: 0
-- Could not find hardware support for NEON on this machine.
-- No OMAP3 processor on this machine.
-- No OMAP4 processor on this machine.
-- Looking for cheev_
-- Looking for cheev_ - found
-- Looking for cgesdd_
-- Looking for cgesdd_ - found
-- Found a library with LAPACK API (open).
disabling ROCM because NOT USE_ROCM is set
-- MIOpen not found. Compiling without MIOpen support
disabling MKLDNN because USE_MKLDNN is not set
-- Looking for clock_gettime in rt
-- Looking for clock_gettime in rt - found
-- Looking for mmap
-- Looking for mmap - found
-- Looking for shm_open
-- Looking for shm_open - found
-- Looking for shm_unlink
-- Looking for shm_unlink - found
-- Looking for malloc_usable_size
-- Looking for malloc_usable_size - found
--
-- check z16
-- Performing Test COMPILE_OUT_z16
-- Performing Test COMPILE_OUT_z16 - Failed
-- Performing Test COMPILE_OUT_z15
-- check z15
-- Performing Test COMPILE_OUT_z15 - Failed
-- Performing Test COMPILE_OUT_z14
-- check z14
-- Performing Test COMPILE_OUT_z14 - Failed
--
-- Version: 10.2.1
-- Build type: Release
-- Using Kineto with CUPTI support
-- Configuring Kineto dependency:
-- KINETO_SOURCE_DIR = /builddir/build/BUILD/pytorch/third_party/kineto/libkineto
-- KINETO_BUILD_TESTS = OFF
-- KINETO_LIBRARY_TYPE = static
-- CUDA_SOURCE_DIR = /usr/local/cuda-12.3
-- CUDA_INCLUDE_DIRS = /usr/local/cuda-12.3/include
-- CUPTI_INCLUDE_DIR = /usr/local/cuda-12.3/include
-- CUDA_cupti_LIBRARY = /usr/local/cuda-12.3/lib64/libcupti.so
-- Found CUPTI
-- Configured Kineto
-- GCC 13.2.1: Adding gcc and gcc_s libs to link line
-- Performing Test HAS_WERROR_RETURN_TYPE
-- Performing Test HAS_WERROR_RETURN_TYPE - Success
-- Performing Test HAS_WERROR_NON_VIRTUAL_DTOR
-- Performing Test HAS_WERROR_NON_VIRTUAL_DTOR - Success
-- Performing Test HAS_WERROR_BRACED_SCALAR_INIT
-- Performing Test HAS_WERROR_BRACED_SCALAR_INIT - Failed
-- Performing Test HAS_WERROR_RANGE_LOOP_CONSTRUCT
-- Performing Test HAS_WERROR_RANGE_LOOP_CONSTRUCT - Success
-- Performing Test HAS_WERROR_BOOL_OPERATION
-- Performing Test HAS_WERROR_BOOL_OPERATION - Success
-- Performing Test HAS_WNARROWING
-- Performing Test HAS_WNARROWING - Success
-- Performing Test HAS_WNO_MISSING_FIELD_INITIALIZERS
-- Performing Test HAS_WNO_MISSING_FIELD_INITIALIZERS - Success
-- Performing Test HAS_WNO_TYPE_LIMITS
-- Performing Test HAS_WNO_TYPE_LIMITS - Success
-- Performing Test HAS_WNO_ARRAY_BOUNDS
-- Performing Test HAS_WNO_ARRAY_BOUNDS - Success
-- Performing Test HAS_WNO_UNKNOWN_PRAGMAS
-- Performing Test HAS_WNO_UNKNOWN_PRAGMAS - Success
-- Performing Test HAS_WNO_UNUSED_PARAMETER
-- Performing Test HAS_WNO_UNUSED_PARAMETER - Success
-- Performing Test HAS_WNO_UNUSED_FUNCTION
-- Performing Test HAS_WNO_UNUSED_FUNCTION - Success
-- Performing Test HAS_WNO_UNUSED_RESULT
-- Performing Test HAS_WNO_UNUSED_RESULT - Success
-- Performing Test HAS_WNO_STRICT_OVERFLOW
-- Performing Test HAS_WNO_STRICT_OVERFLOW - Success
-- Performing Test HAS_WNO_STRICT_ALIASING
-- Performing Test HAS_WNO_STRICT_ALIASING - Success
-- Performing Test HAS_WNO_STRINGOP_OVERFLOW
-- Performing Test HAS_WNO_STRINGOP_OVERFLOW - Success
-- Performing Test HAS_WVLA_EXTENSION
-- Performing Test HAS_WVLA_EXTENSION - Failed
-- Performing Test HAS_WSUGGEST_OVERRIDE
-- Performing Test HAS_WSUGGEST_OVERRIDE - Success
-- Performing Test HAS_WNEWLINE_EOF
-- Performing Test HAS_WNEWLINE_EOF - Failed
-- Performing Test HAS_WINCONSISTENT_MISSING_OVERRIDE
-- Performing Test HAS_WINCONSISTENT_MISSING_OVERRIDE - Failed
-- Performing Test HAS_WINCONSISTENT_MISSING_DESTRUCTOR_OVERRIDE
-- Performing Test HAS_WINCONSISTENT_MISSING_DESTRUCTOR_OVERRIDE - Failed
-- Performing Test HAS_WNO_ERROR_PEDANTIC
-- Performing Test HAS_WNO_ERROR_PEDANTIC - Success
-- Performing Test HAS_WNO_ERROR_OLD_STYLE_CAST
-- Performing Test HAS_WNO_ERROR_OLD_STYLE_CAST - Success
-- Performing Test HAS_WNO_ERROR_INCONSISTENT_MISSING_OVERRIDE
-- Performing Test HAS_WNO_ERROR_INCONSISTENT_MISSING_OVERRIDE - Failed
-- Performing Test HAS_WNO_ERROR_INCONSISTENT_MISSING_DESTRUCTOR_OVERRIDE
-- Performing Test HAS_WNO_ERROR_INCONSISTENT_MISSING_DESTRUCTOR_OVERRIDE - Failed
-- Performing Test HAS_WCONSTANT_CONVERSION
-- Performing Test HAS_WCONSTANT_CONVERSION - Failed
-- Performing Test HAS_WNO_INVALID_PARTIAL_SPECIALIZATION
-- Performing Test HAS_WNO_INVALID_PARTIAL_SPECIALIZATION - Failed
-- Performing Test HAS_WNO_ALIGNED_ALLOCATION_UNAVAILABLE
-- Performing Test HAS_WNO_ALIGNED_ALLOCATION_UNAVAILABLE - Failed
-- Performing Test HAS_WNO_MISSING_BRACES
-- Performing Test HAS_WNO_MISSING_BRACES - Success
-- Performing Test HAS_QUNUSED_ARGUMENTS
-- Performing Test HAS_QUNUSED_ARGUMENTS - Failed
-- Performing Test HAS_FDIAGNOSTICS_COLOR_ALWAYS
-- Performing Test HAS_FDIAGNOSTICS_COLOR_ALWAYS - Success
-- Performing Test HAS_FALIGNED_NEW
-- Performing Test HAS_FALIGNED_NEW - Success
-- Performing Test HAS_WNO_UNUSED_BUT_SET_VARIABLE
-- Performing Test HAS_WNO_UNUSED_BUT_SET_VARIABLE - Success
-- Performing Test HAS_WNO_MAYBE_UNINITIALIZED
-- Performing Test HAS_WNO_MAYBE_UNINITIALIZED - Success
-- Performing Test HAS_FSTANDALONE_DEBUG
-- Performing Test HAS_FSTANDALONE_DEBUG - Failed
-- Performing Test HAS_FNO_MATH_ERRNO
-- Performing Test HAS_FNO_MATH_ERRNO - Success
-- Performing Test HAS_FNO_TRAPPING_MATH
-- Performing Test HAS_FNO_TRAPPING_MATH - Success
-- Performing Test HAS_WERROR_FORMAT
-- Performing Test HAS_WERROR_FORMAT - Success
-- Performing Test HAS_WDEPRECATED
-- Performing Test HAS_WDEPRECATED - Success
-- NUMA paths:
-- /usr/include
-- /usr/lib64/libnuma.so
-- Looking for backtrace
-- Looking for backtrace - found
-- backtrace facility detected in default set of libraries
-- Found Backtrace: /usr/include
-- headers outputs:
-- sources outputs:
-- declarations_yaml outputs:
-- Performing Test COMPILER_SUPPORTS_NO_AVX256_SPLIT
-- Performing Test COMPILER_SUPPORTS_NO_AVX256_SPLIT - Success
-- Using ATen parallel backend: OMP
Found sleef: /usr/lib64/libsleef.so
AT_INSTALL_INCLUDE_DIR include/ATen/core
core header install: /builddir/build/BUILD/pytorch/build/aten/src/ATen/core/TensorBody.h
core header install: /builddir/build/BUILD/pytorch/build/aten/src/ATen/core/aten_interned_strings.h
core header install: /builddir/build/BUILD/pytorch/build/aten/src/ATen/core/enum_tag.h
disable test because ATEN_NO_TEST is set
-- Performing Test HAS_WNO_DEPRECATED_COPY
-- Performing Test HAS_WNO_DEPRECATED_COPY - Success
-- _GLIBCXX_USE_CXX11_ABI=1 is already defined as a cmake variable
-- Using lib/python3.11/site-packages as python relative installation path
--
-- ******** Summary ********
-- General:
-- CMake version : 3.27.7
-- CMake command : /usr/bin/cmake
-- System : Linux
-- C++ compiler : /usr/bin/g++
-- C++ compiler id : GNU
-- C++ compiler version : 13.2.1
-- Using ccache if found : OFF
-- CXX flags : -O2 -fexceptions -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -w -fpermissive -Wno-sign-compare -Wno-deprecated-declarations -Wno-nonnull -DEIGEN_HAS_CXX11_MATH=1 -D_GLIBCXX_USE_CXX11_ABI=1 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DTMP_LIBKINETO_NANOSECOND -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=range-loop-construct -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow
-- Build type : Release
-- Compile definitions : ONNXIFI_ENABLE_EXT=1;ONNX_NAMESPACE=onnx;HAVE_MMAP=1;_FILE_OFFSET_BITS=64;HAVE_SHM_OPEN=1;HAVE_SHM_UNLINK=1;HAVE_MALLOC_USABLE_SIZE=1;USE_EXTERNAL_MZCRC;MINIZ_DISABLE_ZIP_READER_CRC32_CHECKS;FLASHATTENTION_DISABLE_ALIBI
-- CMAKE_PREFIX_PATH : /usr/local/cuda-12.3;/usr/local/cuda-12.3;/usr/local/cuda-12.3
-- CMAKE_INSTALL_PREFIX : /usr
-- USE_GOLD_LINKER : OFF
--
-- TORCH_VERSION : 2.4.0
-- BUILD_CAFFE2 : OFF
-- BUILD_CAFFE2_OPS : OFF
-- BUILD_STATIC_RUNTIME_BENCHMARK: OFF
-- BUILD_BINARY : OFF
-- BUILD_CUSTOM_PROTOBUF : OFF
-- Protobuf compiler : /usr/bin/protoc
-- Protobuf includes : /usr/include
-- Protobuf libraries : /usr/lib64/libprotobuf.so
-- BUILD_DOCS : OFF
-- BUILD_PYTHON : ON
-- Python version : 3.11.8
-- Python executable : /usr/bin/python3
-- Pythonlibs version : 3.11.8
-- Python library : /usr/lib64/python3.11
-- Python includes : /usr/include/python3.11
-- Python site-packages: lib/python3.11/site-packages
-- BUILD_SHARED_LIBS : ON
-- CAFFE2_USE_MSVC_STATIC_RUNTIME : OFF
-- BUILD_TEST : OFF
-- BUILD_JNI : OFF
-- BUILD_MOBILE_AUTOGRAD : OFF
-- BUILD_LITE_INTERPRETER: OFF
-- INTERN_BUILD_MOBILE :
-- TRACING_BASED : OFF
-- USE_BLAS : 1
-- BLAS : open
-- BLAS_HAS_SBGEMM :
-- USE_LAPACK : 1
-- LAPACK : open
-- USE_ASAN : OFF
-- USE_TSAN : OFF
-- USE_CPP_CODE_COVERAGE : OFF
-- USE_CUDA : ON
-- Split CUDA : ON
-- CUDA static link : OFF
-- USE_CUDNN : ON
-- USE_EXPERIMENTAL_CUDNN_V8_API:
-- USE_CUSPARSELT : OFF
-- CUDA version : 12.3
-- USE_FLASH_ATTENTION : ON
-- USE_MEM_EFF_ATTENTION : ON
-- cuDNN version : 8.9.7
-- CUDA root directory : /usr/local/cuda-12.3
-- CUDA library : /usr/local/cuda-12.3/lib64/stubs/libcuda.so
-- cudart library : /usr/local/cuda-12.3/lib64/libcudart.so
-- cublas library : /usr/local/cuda-12.3/lib64/libcublas.so
-- cufft library : /usr/local/cuda-12.3/lib64/libcufft.so
-- curand library : /usr/local/cuda-12.3/lib64/libcurand.so
-- cusparse library : /usr/local/cuda-12.3/lib64/libcusparse.so
-- cuDNN library : /usr/lib64/libcudnn.so
-- nvrtc : /usr/local/cuda-12.3/lib64/libnvrtc.so
-- CUDA include path : /usr/local/cuda-12.3/include
-- NVCC executable : /usr/local/cuda-12.3/bin/nvcc
-- CUDA compiler : /usr/local/cuda-12.3/bin/nvcc
-- CUDA flags : --compiler-options -fPIC -Wno-deprecated-gpu-targets -allow-unsupported-compiler --fatbin-options -compress-all -DLIBCUDACXX_ENABLE_SIMPLIFIED_COMPLEX_OPERATIONS -D_GLIBCXX_USE_CXX11_ABI=1 -Xfatbin -compress-all --compiler-options -fPIC -Wno-deprecated-gpu-targets -allow-unsupported-compiler --fatbin-options -compress-all -DONNX_NAMESPACE=onnx -gencode arch=compute_52,code=sm_52 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_89,code=sm_89 -gencode arch=compute_90,code=sm_90 -gencode arch=compute_52,code=compute_52 -Xcudafe --diag_suppress=cc_clobber_ignored,--diag_suppress=field_without_dll_interface,--diag_suppress=base_class_has_different_dll_interface,--diag_suppress=dll_interface_conflict_none_assumed,--diag_suppress=dll_interface_conflict_dllexport_assumed,--diag_suppress=bad_friend_decl --expt-relaxed-constexpr --expt-extended-lambda -DCUDA_HAS_FP16 -Wno-deprecated-gpu-targets --expt-extended-lambda -DCUB_WRAPPED_NAMESPACE=at_cuda_detail -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__
-- CUDA host compiler : /usr/bin/cuda-g++
-- CUDA --device-c : OFF
-- USE_TENSORRT : OFF
-- USE_XPU : OFF
-- USE_ROCM : OFF
-- BUILD_NVFUSER :
-- USE_EIGEN_FOR_BLAS :
-- USE_FBGEMM : ON
-- USE_FAKELOWP : OFF
-- USE_KINETO : ON
-- USE_FFMPEG : OFF
-- USE_GFLAGS : ON
-- USE_GLOG : ON
-- USE_LEVELDB : ON
-- LevelDB version : 1.23
-- Snappy version : 1.1.9
-- USE_LITE_PROTO : OFF
-- USE_LMDB : ON
-- LMDB version : 0.9.32
-- USE_METAL : OFF
-- USE_PYTORCH_METAL : OFF
-- USE_PYTORCH_METAL_EXPORT : OFF
-- USE_MPS : OFF
-- USE_MKL :
-- USE_MKLDNN : OFF
-- USE_UCC : OFF
-- USE_ITT : ON
-- USE_NCCL : ON
-- USE_SYSTEM_NCCL : ON
-- USE_NNPACK : ON
-- USE_NUMPY : ON
-- USE_OBSERVERS : ON
-- USE_OPENCL : OFF
-- USE_OPENCV : ON
-- OpenCV version : 4.9.0
-- USE_OPENMP : ON
-- USE_TBB : OFF
-- USE_MIMALLOC : OFF
-- USE_VULKAN : OFF
-- USE_PROF : OFF
-- USE_QNNPACK : ON
-- USE_PYTORCH_QNNPACK : ON
-- USE_XNNPACK : ON
-- USE_REDIS : ON
-- USE_ROCKSDB : ON
-- USE_ZMQ : ON
-- USE_DISTRIBUTED : ON
-- USE_MPI : OFF
-- USE_GLOO : ON
-- USE_GLOO_WITH_OPENSSL : OFF
-- USE_TENSORPIPE : ON
-- Public Dependencies :
-- Private Dependencies : Threads::Threads;/usr/lib64/libopenblaso.so;pthreadpool;cpuinfo;qnnpack;pytorch_qnnpack;XNNPACK;fbgemm;/usr/lib64/liblmdb.so;/usr/lib64/libleveldb.so;/usr/lib64/libsnappy.so;/usr/lib64/libzmq.so;/usr/lib64/libhiredis.so;opencv_core;opencv_highgui;opencv_imgproc;opencv_imgcodecs;opencv_optflow;opencv_videoio;opencv_video;ittnotify;caffe2::openmp;tensorpipe;gloo;onnx_proto;onnx;onnx_optimizer;foxi_loader;rt;fmt::fmt-header-only;kineto;gcc_s;gcc;dl
-- Public CUDA Deps. : caffe2::cuda;caffe2::nvrtc
-- Private CUDA Deps. : caffe2::curand;caffe2::cufft;caffe2::cublas;torch::cudnn;__caffe2_nccl;tensorpipe_cuda;gloo_cuda;/usr/local/cuda-12.3/lib64/libcudart.so;CUDA::cusparse;CUDA::cufft;ATEN_CUDA_FILES_GEN_LIB
-- USE_COREML_DELEGATE : OFF
-- BUILD_LAZY_TS_BACKEND : ON
-- USE_ROCM_KERNEL_ASSERT : OFF
-- Performing Test HAS_WMISSING_PROTOTYPES
-- Performing Test HAS_WMISSING_PROTOTYPES - Failed
-- Performing Test HAS_WERROR_MISSING_PROTOTYPES
-- Performing Test HAS_WERROR_MISSING_PROTOTYPES - Failed
-- Configuring done (21.7s)
CMake Warning at torch/CMakeLists.txt:282 (target_link_libraries):
  Target "_C" requests linking to directory "/usr/lib64/python3.11".
  Targets may link only to libraries.  CMake is dropping the item.
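One configure-time detail worth noting from the output above: the NCCL check grepped /usr/include/nccl.h for NCCL_VERSION_CODE, failed to find it, and therefore fell back to assuming "NCCL version < 2.3.5-5". The sketch below shows the general shape of that kind of header-macro probe; `nccl_version_code` is a hypothetical illustration, not the actual FindNCCL.cmake logic.

```python
import re

def nccl_version_code(header_text: str):
    """Look for '#define NCCL_VERSION_CODE <int>' the way the CMake probe
    in this log does. Returns the integer code, or None when the macro is
    absent (which is what produced 'NCCL version < 2.3.5-5' above).
    Illustrative sketch only."""
    m = re.search(r"#define\s+NCCL_VERSION_CODE\s+(\d+)", header_text)
    return int(m.group(1)) if m else None

print(nccl_version_code("#define NCCL_VERSION_CODE 21905"))
print(nccl_version_code("/* header with no version macro */"))
```

When the macro is missing, as here, the build still links against /usr/lib64/libnccl.so; only the version-dependent code paths are configured conservatively.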
-- Generating done (0.8s)
CMake Warning:
  Manually-specified variables were not used by the project:

    CMAKE_Fortran_FLAGS_RELEASE
    CMAKE_INSTALL_DO_STRIP
    INCLUDE_INSTALL_DIR
    LIB_INSTALL_DIR
    LIB_SUFFIX
    SHARE_INSTALL_PREFIX
    SYSCONF_INSTALL_DIR
    USE_BREAKPAD
    USE_FAST_NVCC

-- Build files have been written to: /builddir/build/BUILD/pytorch/build
+ make -j4
[ 0%] Building C object confu-deps/clog/CMakeFiles/clog.dir/src/clog.c.o
[ 0%] Linking C static library ../../lib/libfxdiv.a
[ 0%] Linking C static library ../../lib/libfp16.a
[ 0%] Linking C static library ../../lib/libpsimd.a
[ 0%] Built target fxdiv
[ 0%] Built target psimd
[ 0%] Built target fp16
[ 0%] Building C object confu-deps/XNNPACK/CMakeFiles/packing.dir/src/packing.c.o
[ 0%] Building C object confu-deps/XNNPACK/CMakeFiles/logging.dir/src/enums/datatype-strings.c.o
[ 0%] Building C object confu-deps/XNNPACK/CMakeFiles/normalization.dir/src/normalization.c.o
[ 0%] Building C object confu-deps/XNNPACK/CMakeFiles/logging.dir/src/enums/microkernel-type.c.o
[ 0%] Building C object confu-deps/XNNPACK/CMakeFiles/logging.dir/src/enums/node-type.c.o
[ 0%] Linking C static library ../../lib/libclog.a
[ 0%] Building C object confu-deps/XNNPACK/CMakeFiles/logging.dir/src/enums/operator-type.c.o
[ 0%] Building C object confu-deps/XNNPACK/CMakeFiles/logging.dir/src/log.c.o
[ 0%] Built target logging
[ 0%] Built target normalization
[ 0%] Building C object confu-deps/XNNPACK/CMakeFiles/allocator.dir/src/allocator.c.o
[ 0%] Building C object confu-deps/XNNPACK/CMakeFiles/memory.dir/src/memory.c.o
[ 0%] Built target allocator
[ 0%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernel-utils.dir/src/microkernel-utils.c.o
[ 0%] Built target microkernel-utils
[ 0%] Built target memory
[ 0%] Building C object confu-deps/XNNPACK/CMakeFiles/mutex.dir/src/mutex.c.o
[ 0%] Building C object confu-deps/XNNPACK/CMakeFiles/post-operation.dir/src/operators/post-operation.c.o
[ 0%] Built target mutex
[ 0%] Built target post-operation
[ 0%] Building C object confu-deps/XNNPACK/CMakeFiles/operator-utils.dir/src/operator-utils.c.o
[ 0%] Building C object confu-deps/XNNPACK/CMakeFiles/operator-run.dir/src/operator-run.c.o
[ 0%] Built target operator-utils
[ 0%] Building CXX object confu-deps/XNNPACK/CMakeFiles/convolution-test-helpers.dir/test/convolution-test-helpers.cc.o
[ 0%] Built target convolution-test-helpers
[ 0%] Building C object third_party/ittapi/CMakeFiles/ittnotify.dir/src/ittnotify/ittnotify_static.c.o
[ 0%] Built target operator-run
[ 0%] Building CXX object third_party/fmt/CMakeFiles/fmt.dir/src/format.cc.o
[ 0%] Built target clog
[ 0%] Building CXX object c10/CMakeFiles/c10.dir/core/Allocator.cpp.o
[ 0%] Building C object third_party/ittapi/CMakeFiles/ittnotify.dir/src/ittnotify/jitprofiling.c.o
[ 0%] Building CXX object c10/CMakeFiles/c10.dir/core/AutogradState.cpp.o
[ 0%] Building CXX object c10/CMakeFiles/c10.dir/core/CPUAllocator.cpp.o
[ 0%] Linking C static library ../../lib/libittnotify.a
[ 0%] Built target ittnotify
[ 0%] Running C++/Python protocol buffer compiler on /builddir/build/BUILD/pytorch/caffe2/proto/torch.proto
[ 0%] Running C++/Python protocol buffer compiler on /builddir/build/BUILD/pytorch/caffe2/proto/caffe2.proto
[ 0%] Building CXX object caffe2/proto/CMakeFiles/Caffe2_PROTO.dir/torch.pb.cc.o
[ 0%] Built target packing
[ 0%] Building CXX object c10/CMakeFiles/c10.dir/core/ConstantSymNodeImpl.cpp.o
[ 0%] Building CXX object caffe2/CMakeFiles/caffe2_nvrtc.dir/__/aten/src/ATen/cuda/nvrtc_stub/ATenNVRTC.cpp.o
[ 0%] Linking CXX shared library ../lib/libcaffe2_nvrtc.so
Warning: Unused direct dependencies:
	libcuda.so.1
	/lib64/libm.so.6
	/lib64/libgcc_s.so.1
[ 0%] Built target caffe2_nvrtc
[ 0%] Generating ATen headers
[ 0%] Building CXX object c10/CMakeFiles/c10.dir/core/CopyBytes.cpp.o
[ 0%] Building CXX object caffe2/proto/CMakeFiles/Caffe2_PROTO.dir/caffe2.pb.cc.o
[ 0%] Building CXX object c10/CMakeFiles/c10.dir/core/DefaultDtype.cpp.o
[ 0%] Building CXX object third_party/fmt/CMakeFiles/fmt.dir/src/os.cc.o
[ 0%] Building CXX object c10/CMakeFiles/c10.dir/core/Device.cpp.o
[ 0%] Linking CXX static library ../../lib/libfmt.a
[ 0%] Built target fmt
[ 0%] Generating ATen sources
[ 0%] Building CXX object c10/CMakeFiles/c10.dir/core/DeviceType.cpp.o
[ 0%] Building CXX object c10/CMakeFiles/c10.dir/core/DispatchKey.cpp.o
[ 0%] Building CXX object c10/CMakeFiles/c10.dir/core/DispatchKeySet.cpp.o
[ 0%] Building CXX object c10/CMakeFiles/c10.dir/core/GeneratorImpl.cpp.o
[ 0%] Built target Caffe2_PROTO
[ 0%] Generating ATen headers
[ 0%] Building CXX object c10/CMakeFiles/c10.dir/core/GradMode.cpp.o
[ 0%] Building CXX object c10/CMakeFiles/c10.dir/core/InferenceMode.cpp.o
[ 0%] Generating ATen sources
[ 0%] Building CXX object c10/CMakeFiles/c10.dir/core/RefcountedDeleter.cpp.o
[ 0%] Building CXX object c10/CMakeFiles/c10.dir/core/SafePyObject.cpp.o
[ 0%] Building CXX object c10/CMakeFiles/c10.dir/core/Scalar.cpp.o
[ 0%] Building CXX object c10/CMakeFiles/c10.dir/core/ScalarType.cpp.o
[ 0%] Building CXX object c10/CMakeFiles/c10.dir/core/Storage.cpp.o
[ 0%] Generating ATen declarations_yaml
[ 0%] Building CXX object c10/CMakeFiles/c10.dir/core/StorageImpl.cpp.o
[ 0%] Building CXX object c10/CMakeFiles/c10.dir/core/Stream.cpp.o
[ 0%] Building CXX object c10/CMakeFiles/c10.dir/core/SymBool.cpp.o
[ 0%] Building CXX object c10/CMakeFiles/c10.dir/core/SymFloat.cpp.o
[ 0%] Building CXX object c10/CMakeFiles/c10.dir/core/SymInt.cpp.o
[ 0%] Building CXX object c10/CMakeFiles/c10.dir/core/SymIntArrayRef.cpp.o
[ 1%] Building CXX object c10/CMakeFiles/c10.dir/core/SymNodeImpl.cpp.o
[ 1%] Building CXX object c10/CMakeFiles/c10.dir/core/SymbolicShapeMeta.cpp.o
[ 1%] Building CXX object c10/CMakeFiles/c10.dir/core/TensorImpl.cpp.o
[ 1%] Building CXX object c10/CMakeFiles/c10.dir/core/TensorOptions.cpp.o
[ 1%] Building CXX object c10/CMakeFiles/c10.dir/core/UndefinedTensorImpl.cpp.o
[ 1%] Building CXX object
c10/CMakeFiles/c10.dir/core/WrapDimMinimal.cpp.o [ 1%] Building C object caffe2/CMakeFiles/torch_global_deps.dir/__/torch/csrc/empty.c.o [ 1%] Linking C shared library ../lib/libtorch_global_deps.so Warning: Unused direct dependencies: /lib64/libstdc++.so.6 /usr/local/cuda-12.3/lib64/libnvrtc.so.12 libcuda.so.1 /usr/local/cuda-12.3/lib64/libcudart.so.12 /usr/local/cuda-12.3/lib64/libnvToolsExt.so.1 [ 1%] Built target torch_global_deps [ 1%] Building CXX object c10/CMakeFiles/c10.dir/core/impl/COW.cpp.o [ 1%] Building CXX object c10/CMakeFiles/c10.dir/core/impl/COWDeleter.cpp.o [ 1%] Built target python_copy_files [ 1%] Generating /builddir/build/BUILD/pytorch/torch/csrc/autograd/generated/Functions.cpp, /builddir/build/BUILD/pytorch/torch/csrc/autograd/generated/ViewFuncs.cpp, /builddir/build/BUILD/pytorch/torch/csrc/autograd/generated/VariableType_0.cpp, /builddir/build/BUILD/pytorch/torch/csrc/autograd/generated/VariableType_1.cpp, /builddir/build/BUILD/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp, /builddir/build/BUILD/pytorch/torch/csrc/autograd/generated/VariableType_3.cpp, /builddir/build/BUILD/pytorch/torch/csrc/autograd/generated/VariableType_4.cpp, /builddir/build/BUILD/pytorch/torch/csrc/autograd/generated/TraceType_0.cpp, /builddir/build/BUILD/pytorch/torch/csrc/autograd/generated/TraceType_1.cpp, /builddir/build/BUILD/pytorch/torch/csrc/autograd/generated/TraceType_2.cpp, /builddir/build/BUILD/pytorch/torch/csrc/autograd/generated/TraceType_3.cpp, /builddir/build/BUILD/pytorch/torch/csrc/autograd/generated/TraceType_4.cpp, /builddir/build/BUILD/pytorch/torch/csrc/autograd/generated/ADInplaceOrViewType_0.cpp, /builddir/build/BUILD/pytorch/torch/csrc/autograd/generated/ADInplaceOrViewType_1.cpp, /builddir/build/BUILD/pytorch/torch/csrc/inductor/aoti_torch/generated/c_shim_cpu.cpp, /builddir/build/BUILD/pytorch/torch/csrc/lazy/generated/LazyNativeFunctions.cpp, /builddir/build/BUILD/pytorch/torch/csrc/lazy/generated/RegisterAutogradLazy.cpp, 
/builddir/build/BUILD/pytorch/torch/csrc/lazy/generated/RegisterLazy.cpp, /builddir/build/BUILD/pytorch/torch/csrc/autograd/generated/Functions.h, /builddir/build/BUILD/pytorch/torch/csrc/autograd/generated/variable_factories.h, /builddir/build/BUILD/pytorch/torch/csrc/autograd/generated/ViewFuncs.h, /builddir/build/BUILD/pytorch/torch/csrc/autograd/generated/VariableType.h, /builddir/build/BUILD/pytorch/torch/csrc/lazy/generated/LazyIr.h, /builddir/build/BUILD/pytorch/torch/csrc/lazy/generated/LazyNonNativeIr.h, /builddir/build/BUILD/pytorch/torch/csrc/lazy/generated/LazyNativeFunctions.h, /builddir/build/BUILD/pytorch/torch/csrc/autograd/generated/python_functions_0.cpp, /builddir/build/BUILD/pytorch/torch/csrc/autograd/generated/python_functions_1.cpp, /builddir/build/BUILD/pytorch/torch/csrc/autograd/generated/python_functions_2.cpp, /builddir/build/BUILD/pytorch/torch/csrc/autograd/generated/python_functions_3.cpp, /builddir/build/BUILD/pytorch/torch/csrc/autograd/generated/python_functions_4.cpp, /builddir/build/BUILD/pytorch/torch/csrc/autograd/generated/python_variable_methods.cpp, /builddir/build/BUILD/pytorch/torch/csrc/autograd/generated/python_torch_functions_0.cpp, /builddir/build/BUILD/pytorch/torch/csrc/autograd/generated/python_torch_functions_1.cpp, /builddir/build/BUILD/pytorch/torch/csrc/autograd/generated/python_torch_functions_2.cpp, /builddir/build/BUILD/pytorch/torch/csrc/autograd/generated/python_nn_functions.cpp, /builddir/build/BUILD/pytorch/torch/csrc/autograd/generated/python_fft_functions.cpp, /builddir/build/BUILD/pytorch/torch/csrc/autograd/generated/python_linalg_functions.cpp, /builddir/build/BUILD/pytorch/torch/csrc/autograd/generated/python_nested_functions.cpp, /builddir/build/BUILD/pytorch/torch/csrc/autograd/generated/python_sparse_functions.cpp, /builddir/build/BUILD/pytorch/torch/csrc/autograd/generated/python_special_functions.cpp, /builddir/build/BUILD/pytorch/torch/csrc/autograd/generated/python_return_types.cpp, 
/builddir/build/BUILD/pytorch/torch/csrc/autograd/generated/python_enum_tag.cpp, /builddir/build/BUILD/pytorch/torch/csrc/autograd/generated/python_functions.h, /builddir/build/BUILD/pytorch/torch/csrc/autograd/generated/python_return_types.h, /builddir/build/BUILD/pytorch/torch/testing/_internal/generated/annotated_fn_args.py, /builddir/build/BUILD/pytorch/torch/csrc/inductor/aoti_torch/generated/c_shim_cuda.cpp [ 1%] Building CXX object c10/CMakeFiles/c10.dir/core/impl/DeviceGuardImplInterface.cpp.o [ 1%] Building CXX object c10/CMakeFiles/c10.dir/core/impl/GPUTrace.cpp.o [ 1%] Building CXX object c10/CMakeFiles/c10.dir/core/impl/HermeticPyObjectTLS.cpp.o [ 1%] Building CXX object c10/CMakeFiles/c10.dir/core/impl/LocalDispatchKeySet.cpp.o [ 1%] Building CXX object c10/CMakeFiles/c10.dir/core/impl/PyInterpreter.cpp.o [ 1%] Building CXX object c10/CMakeFiles/c10.dir/core/impl/PyObjectSlot.cpp.o [ 1%] Building CXX object c10/CMakeFiles/c10.dir/core/impl/PythonDispatcherTLS.cpp.o [ 1%] Building CXX object c10/CMakeFiles/c10.dir/core/impl/SizesAndStrides.cpp.o [ 1%] Building CXX object c10/CMakeFiles/c10.dir/core/impl/TorchDispatchModeTLS.cpp.o [ 1%] Building CXX object c10/CMakeFiles/c10.dir/core/impl/alloc_cpu.cpp.o [ 1%] Building CXX object c10/CMakeFiles/c10.dir/core/thread_pool.cpp.o [ 1%] Built target generate-torch-sources [ 1%] Generating /builddir/build/BUILD/pytorch/torch/_C/__init__.pyi, /builddir/build/BUILD/pytorch/torch/_C/_VariableFunctions.pyi, /builddir/build/BUILD/pytorch/torch/nn/functional.pyi [ 1%] Building CXX object c10/CMakeFiles/c10.dir/mobile/CPUCachingAllocator.cpp.o [ 1%] Building CXX object c10/CMakeFiles/c10.dir/mobile/CPUProfilingAllocator.cpp.o [ 1%] Building CXX object c10/CMakeFiles/c10.dir/util/ApproximateClock.cpp.o [ 1%] Building CXX object c10/CMakeFiles/c10.dir/util/Backtrace.cpp.o [ 1%] Generating /builddir/build/BUILD/pytorch/torch/utils/data/datapipes/datapipe.pyi [ 1%] Built target torch_python_stubs [ 1%] Generating 
/builddir/build/BUILD/pytorch/torch/version.py [ 1%] Building CXX object c10/CMakeFiles/c10.dir/util/Bfloat16.cpp.o [ 1%] Built target gen_torch_version [ 1%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/init.c.o [ 1%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/add.c.o [ 1%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/average-pooling.c.o [ 1%] Building CXX object c10/CMakeFiles/c10.dir/util/C++17.cpp.o [ 1%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/channel-shuffle.c.o [ 1%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/clamp.c.o [ 1%] Building CXX object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/conv-prepack.cc.o [ 1%] Building CXX object c10/CMakeFiles/c10.dir/util/DeadlockDetection.cpp.o [ 1%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/convolution.c.o [ 1%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/deconvolution.c.o [ 1%] Building CXX object c10/CMakeFiles/c10.dir/util/Exception.cpp.o [ 1%] Building CXX object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/fc-prepack.cc.o [ 1%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/fully-connected.c.o [ 1%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/fully-connected-sparse.c.o [ 1%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/global-average-pooling.c.o [ 1%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/hardsigmoid.c.o [ 1%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/hardswish.c.o [ 1%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/leaky-relu.c.o [ 1%] Building C object 
confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/max-pooling.c.o [ 1%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/sigmoid.c.o [ 1%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/softargmax.c.o [ 1%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/tanh.c.o [ 1%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/operator-delete.c.o [ 1%] Building CXX object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/conv-run.cc.o [ 1%] Building CXX object c10/CMakeFiles/c10.dir/util/Float8_e4m3fn.cpp.o [ 1%] Building CXX object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/deconv-run.cc.o [ 1%] Building CXX object c10/CMakeFiles/c10.dir/util/Float8_e4m3fnuz.cpp.o [ 1%] Building CXX object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/fc-run.cc.o [ 1%] Building CXX object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/fc-unpack.cc.o [ 1%] Building CXX object c10/CMakeFiles/c10.dir/util/Float8_e5m2.cpp.o [ 1%] Building CXX object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/fc-dynamic-run.cc.o [ 1%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/indirection.c.o [ 1%] Building CXX object c10/CMakeFiles/c10.dir/util/Float8_e5m2fnuz.cpp.o [ 1%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/operator-run.c.o [ 1%] Building CXX object c10/CMakeFiles/c10.dir/util/Half.cpp.o [ 1%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/u8lut32norm/scalar.c.o [ 1%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/x8lut/scalar.c.o [ 1%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/sgemm/6x8-psimd.c.o [ 1%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/q8avgpool/mp8x9p8q-sse2.c.o [ 
1%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/q8avgpool/up8x9-sse2.c.o [ 1%] Building CXX object c10/CMakeFiles/c10.dir/util/LeftRight.cpp.o [ 1%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/q8avgpool/up8xm-sse2.c.o [ 1%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/q8conv/4x4c2-sse2.c.o [ 1%] Building CXX object c10/CMakeFiles/c10.dir/util/Logging.cpp.o [ 1%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/q8dwconv/mp8x25-sse2.c.o [ 1%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/q8dwconv/mp8x25-sse2-per-channel.c.o [ 1%] Building CXX object c10/CMakeFiles/c10.dir/util/MathConstants.cpp.o [ 1%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/q8dwconv/mp8x27-sse2.c.o [ 1%] Building CXX object c10/CMakeFiles/c10.dir/util/Metaprogramming.cpp.o [ 1%] Building CXX object c10/CMakeFiles/c10.dir/util/Optional.cpp.o [ 1%] Building CXX object c10/CMakeFiles/c10.dir/util/ParallelGuard.cpp.o [ 1%] Building CXX object c10/CMakeFiles/c10.dir/util/SmallVector.cpp.o [ 2%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/q8dwconv/up8x9-sse2.c.o [ 2%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/q8dwconv/up8x9-sse2-per-channel.c.o [ 2%] Building CXX object c10/CMakeFiles/c10.dir/util/StringUtil.cpp.o [ 2%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/q8gavgpool/mp8x7p7q-sse2.c.o [ 2%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/q8gavgpool/up8x7-sse2.c.o [ 2%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/q8gavgpool/up8xm-sse2.c.o [ 2%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/q8gemm/2x4c8-sse2.c.o [ 2%] Building CXX object 
c10/CMakeFiles/c10.dir/util/ThreadLocalDebugInfo.cpp.o [ 2%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/q8gemm/4x4c2-dq-sse2.c.o [ 2%] Building CXX object c10/CMakeFiles/c10.dir/util/TypeCast.cpp.o [ 2%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/q8gemm/4x4c2-sse2.c.o [ 2%] Building CXX object c10/CMakeFiles/c10.dir/util/TypeList.cpp.o [ 2%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/q8gemm_sparse/8x4c1x4-dq-packedA-sse2.c.o [ 2%] Building CXX object c10/CMakeFiles/c10.dir/util/TypeTraits.cpp.o [ 2%] Building CXX object c10/CMakeFiles/c10.dir/util/Type_demangle.cpp.o [ 2%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/q8gemm_sparse/8x4-packA-sse2.c.o [ 2%] Building CXX object c10/CMakeFiles/c10.dir/util/Type_no_demangle.cpp.o [ 2%] Building CXX object c10/CMakeFiles/c10.dir/util/Unicode.cpp.o [ 2%] Building CXX object c10/CMakeFiles/c10.dir/util/UniqueVoidPtr.cpp.o [ 2%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/q8vadd/sse2.c.o [ 2%] Building CXX object c10/CMakeFiles/c10.dir/util/complex_math.cpp.o [ 2%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/u8clamp/sse2.c.o [ 2%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/u8maxpool/16x9p8q-sse2.c.o [ 2%] Building CXX object c10/CMakeFiles/c10.dir/util/flags_use_gflags.cpp.o [ 2%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/u8maxpool/sub16-sse2.c.o [ 2%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/u8rmax/sse2.c.o [ 2%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/x8zip/x2-sse2.c.o [ 2%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/x8zip/x3-sse2.c.o [ 2%] Building C object 
confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/x8zip/x4-sse2.c.o [ 2%] Building C object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/x8zip/xm-sse2.c.o [ 2%] Building CXX object c10/CMakeFiles/c10.dir/util/flags_use_no_gflags.cpp.o [ 2%] Linking CXX static library ../../lib/libpytorch_qnnpack.a [ 2%] Built target pytorch_qnnpack [ 2%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/cs16-bfly4/cs16-bfly4-samples1-scalar.c.o [ 2%] Building CXX object c10/CMakeFiles/c10.dir/util/int128.cpp.o [ 2%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/cs16-bfly4/cs16-bfly4-samples4-scalar.c.o [ 2%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/cs16-bfly4/gen/cs16-bfly4-scalar-x1.c.o [ 2%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/cs16-bfly4/gen/cs16-bfly4-scalar-x2.c.o [ 2%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/cs16-bfly4/gen/cs16-bfly4-scalar-x4.c.o [ 2%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/cs16-fftr/gen/cs16-fftr-scalar-x1.c.o [ 2%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/cs16-fftr/gen/cs16-fftr-scalar-x2.c.o [ 2%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/cs16-fftr/gen/cs16-fftr-scalar-x4.c.o [ 2%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/cs16-vsquareabs/gen/cs16-vsquareabs-scalar-x1.c.o [ 2%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/cs16-vsquareabs/gen/cs16-vsquareabs-scalar-x2.c.o [ 2%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/cs16-vsquareabs/gen/cs16-vsquareabs-scalar-x3.c.o [ 2%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/cs16-vsquareabs/gen/cs16-vsquareabs-scalar-x4.c.o [ 2%] Building CXX object c10/CMakeFiles/c10.dir/util/intrusive_ptr.cpp.o [ 2%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32-vcvt/gen/f16-f32-vcvt-scalar-u1.c.o [ 2%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32-vcvt/gen/f16-f32-vcvt-scalar-u2.c.o [ 2%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32-vcvt/gen/f16-f32-vcvt-scalar-u3.c.o [ 2%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32-vcvt/gen/f16-f32-vcvt-scalar-u4.c.o [ 2%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-qs8-vcvt/gen/f16-qs8-vcvt-scalar-fmagic-u1.c.o [ 2%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-qs8-vcvt/gen/f16-qs8-vcvt-scalar-fmagic-u2.c.o [ 2%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-qs8-vcvt/gen/f16-qs8-vcvt-scalar-fmagic-u3.c.o [ 2%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-qs8-vcvt/gen/f16-qs8-vcvt-scalar-fmagic-u4.c.o [ 2%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-qs8-vcvt/gen/f16-qs8-vcvt-scalar-imagic-u1.c.o [ 2%] Building CXX object c10/CMakeFiles/c10.dir/util/numa.cpp.o [ 2%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-qs8-vcvt/gen/f16-qs8-vcvt-scalar-imagic-u2.c.o [ 2%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-qs8-vcvt/gen/f16-qs8-vcvt-scalar-imagic-u3.c.o [ 2%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-qs8-vcvt/gen/f16-qs8-vcvt-scalar-imagic-u4.c.o [ 2%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-rminmax/gen/f16-rmax-scalar-u1.c.o [ 2%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-rminmax/gen/f16-rmax-scalar-u2-acc2.c.o [ 2%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-rminmax/gen/f16-rmax-scalar-u3-acc3.c.o [ 2%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-rminmax/gen/f16-rmax-scalar-u4-acc2.c.o [ 2%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-rminmax/gen/f16-rmax-scalar-u4-acc4.c.o [ 2%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-rminmax/gen/f16-rmin-scalar-u1.c.o [ 2%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-rminmax/gen/f16-rmin-scalar-u2-acc2.c.o [ 2%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-rminmax/gen/f16-rmin-scalar-u3-acc3.c.o [ 2%] Building CXX object c10/CMakeFiles/c10.dir/util/signal_handler.cpp.o [ 2%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-rminmax/gen/f16-rmin-scalar-u4-acc2.c.o [ 2%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-rminmax/gen/f16-rmin-scalar-u4-acc4.c.o [ 2%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-rminmax/gen/f16-rminmax-scalar-u1.c.o [ 2%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-rminmax/gen/f16-rminmax-scalar-u2-acc2.c.o [ 2%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-rminmax/gen/f16-rminmax-scalar-u3-acc3.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-rminmax/gen/f16-rminmax-scalar-u4-acc2.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-rminmax/gen/f16-rminmax-scalar-u4-acc4.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-argmaxpool/f32-argmaxpool-4x-scalar-c1.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-argmaxpool/f32-argmaxpool-9p8x-scalar-c1.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-argmaxpool/f32-argmaxpool-9x-scalar-c1.c.o [ 3%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-avgpool/f32-avgpool-9p8x-minmax-scalar-c1.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-avgpool/f32-avgpool-9x-minmax-scalar-c1.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-conv-hwc2chw/f32-conv-hwc2chw-3x3s2p1c3x4-scalar-1x1.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-conv-hwc/f32-conv-hwc-3x3s2p0p1c3x4-scalar-1x1.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-conv-hwc/f32-conv-hwc-3x3s2p1c3x4-scalar-1x1.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3p1-minmax-scalar-1x1-acc2.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3p1-minmax-scalar-1x1-acc3.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3p1-minmax-scalar-1x1-acc4.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3p1-minmax-scalar-1x1.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3p1-minmax-scalar-2x1-acc2.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3p1-minmax-scalar-2x1.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3p1-minmax-scalar-3x1.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3p1-minmax-scalar-4x1.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3p1-minmax-scalar-5x1.c.o [ 3%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3p1-minmax-scalar-6x1.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3s2p1-minmax-scalar-1x1-acc2.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3s2p1-minmax-scalar-1x1-acc3.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3s2p1-minmax-scalar-1x1-acc4.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3s2p1-minmax-scalar-1x1.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3s2p1-minmax-scalar-2x1-acc2.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3s2p1-minmax-scalar-2x1.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3s2p1-minmax-scalar-3x1.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3s2p1-minmax-scalar-4x1.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5p2-minmax-scalar-1x1-acc2.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5p2-minmax-scalar-1x1-acc3.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5p2-minmax-scalar-1x1-acc4.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5p2-minmax-scalar-1x1-acc5.c.o [ 3%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5p2-minmax-scalar-1x1.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5p2-minmax-scalar-2x1-acc2.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5p2-minmax-scalar-2x1-acc3.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5p2-minmax-scalar-2x1.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5p2-minmax-scalar-3x1-acc2.c.o [ 3%] Building CXX object c10/CMakeFiles/c10.dir/util/tempfile.cpp.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5p2-minmax-scalar-3x1.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5s2p2-minmax-scalar-1x1-acc2.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5s2p2-minmax-scalar-1x1-acc3.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5s2p2-minmax-scalar-1x1-acc4.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5s2p2-minmax-scalar-1x1-acc5.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5s2p2-minmax-scalar-1x1.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5s2p2-minmax-scalar-2x1-acc2.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5s2p2-minmax-scalar-2x1-acc3.c.o [ 3%] 
Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5s2p2-minmax-scalar-2x1.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5s2p2-minmax-scalar-3x1-acc2.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5s2p2-minmax-scalar-3x1.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-2f2m2l1c1s1r-minmax-scalar-acc2.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-2f2m2l1c1s1r-minmax-scalar.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-2f2m2l1c1s1r-scalar-acc2.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-2f2m2l1c1s1r-scalar.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-2f2m2l4c1s1r-minmax-scalar-acc2.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-2f2m2l4c1s1r-minmax-scalar.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-2f2m2l4c1s1r-scalar-acc2.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-2f2m2l4c1s1r-scalar.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-3f3m3l1c1s1r-scalar-acc2.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-3f3m3l1c1s1r-scalar.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-3p1c-minmax-scalar-acc2.c.o [ 3%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-3p1c-minmax-scalar.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-3p1c-scalar-acc2.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-3p1c-scalar.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-3p2c-minmax-scalar-acc2.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-3p2c-minmax-scalar.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-3p2c-scalar-acc2.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-3p2c-scalar.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-4p1c-minmax-scalar-acc2.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-4p1c-minmax-scalar.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-4p1c-scalar-acc2.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-4p1c-scalar.c.o [ 3%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-4p2c-minmax-scalar-acc2.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-4p2c-minmax-scalar.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-4p2c-scalar-acc2.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-4p2c-scalar.c.o [ 4%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-5f5m5l1c1s1r-minmax-scalar-acc2.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-5f5m5l1c1s1r-minmax-scalar.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-5f5m5l1c1s1r-scalar-acc2.c.o [ 4%] Building CXX object c10/CMakeFiles/c10.dir/util/thread_name.cpp.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-5f5m5l1c1s1r-scalar.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-6f6m7l1c1s1r-minmax-scalar-acc2.c.o [ 4%] Building CXX object c10/CMakeFiles/c10.dir/util/typeid.cpp.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-6f6m7l1c1s1r-minmax-scalar.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-6f6m7l1c1s1r-scalar-acc2.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-6f6m7l1c1s1r-scalar.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-8f8m9l1c1s1r-minmax-scalar-acc2.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-8f8m9l1c1s1r-minmax-scalar.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-8f8m9l1c1s1r-scalar-acc2.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-8f8m9l1c1s1r-scalar.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-9p1c-minmax-scalar-acc2.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-9p1c-minmax-scalar.c.o [ 
4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-9p1c-scalar-acc2.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-9p1c-scalar.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-9p2c-minmax-scalar-acc2.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-9p2c-minmax-scalar.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-9p2c-scalar-acc2.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-9p2c-scalar.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-25p1c-minmax-scalar-acc2.c.o [ 4%] Linking CXX shared library ../lib/libc10.so [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-25p1c-minmax-scalar.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-25p1c-scalar-acc2.c.o [ 4%] Built target c10 [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-prod.dir/src/amalgam/gen/scalar.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-25p1c-scalar.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-25p2c-minmax-scalar-acc2.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-25p2c-minmax-scalar.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-25p2c-scalar-acc2.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-25p2c-scalar.c.o [ 4%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-f16-vcvt/gen/f32-f16-vcvt-scalar-bitcast-u1.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-f16-vcvt/gen/f32-f16-vcvt-scalar-bitcast-u2.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-f16-vcvt/gen/f32-f16-vcvt-scalar-bitcast-u3.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-f16-vcvt/gen/f32-f16-vcvt-scalar-bitcast-u4.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-f16-vcvt/gen/f32-f16-vcvt-scalar-fabsf-u1.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-f16-vcvt/gen/f32-f16-vcvt-scalar-fabsf-u2.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-f16-vcvt/gen/f32-f16-vcvt-scalar-fabsf-u3.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-f16-vcvt/gen/f32-f16-vcvt-scalar-fabsf-u4.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gavgpool-cw/f32-gavgpool-cw-scalar-u1.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gavgpool/f32-gavgpool-7p7x-minmax-scalar-c1.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gavgpool/f32-gavgpool-7x-minmax-scalar-c1.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-1x4-minmax-scalar.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-1x4-relu-scalar.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-1x4-scalar.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-2x4-minmax-scalar.c.o [ 4%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-2x4-relu-scalar.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-2x4-scalar.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-4x2-minmax-scalar.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-4x2-relu-scalar.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-4x2-scalar.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-4x4-minmax-scalar.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-4x4-relu-scalar.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-4x4-scalar.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-1x4-minmax-scalar.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-2x4-minmax-scalar.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-4x4-minmax-scalar.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-ibilinear-chw/gen/f32-ibilinear-chw-scalar-p1.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-ibilinear-chw/gen/f32-ibilinear-chw-scalar-p2.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-ibilinear-chw/gen/f32-ibilinear-chw-scalar-p4.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-ibilinear/gen/f32-ibilinear-scalar-c1.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-ibilinear/gen/f32-ibilinear-scalar-c2.c.o [ 4%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-ibilinear/gen/f32-ibilinear-scalar-c4.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-1x4-minmax-scalar.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-1x4-relu-scalar.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-1x4-scalar.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-2x4-minmax-scalar.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-2x4-relu-scalar.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-2x4-scalar.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-4x2-minmax-scalar.c.o [ 4%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-4x2-relu-scalar.c.o [ 5%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-4x2-scalar.c.o [ 5%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-4x4-minmax-scalar.c.o [ 5%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-4x4-relu-scalar.c.o [ 5%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-4x4-scalar.c.o [ 5%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-maxpool/f32-maxpool-9p8x-minmax-scalar-c1.c.o [ 5%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-pavgpool/f32-pavgpool-9p8x-minmax-scalar-c1.c.o [ 5%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-pavgpool/f32-pavgpool-9x-minmax-scalar-c1.c.o [ 5%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-ppmm/gen/f32-ppmm-2x4-minmax-scalar.c.o [ 5%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-ppmm/gen/f32-ppmm-3x3-minmax-scalar.c.o [ 5%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-ppmm/gen/f32-ppmm-4x2-minmax-scalar.c.o [ 5%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-ppmm/gen/f32-ppmm-4x4-minmax-scalar.c.o [ 5%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-prelu/gen/f32-prelu-scalar-2x1.c.o [ 5%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-prelu/gen/f32-prelu-scalar-2x4.c.o [ 5%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc4w-gemm/gen/f32-qc4w-gemm-1x4-minmax-scalar.c.o [ 5%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc4w-gemm/gen/f32-qc4w-gemm-2x4-minmax-scalar.c.o [ 5%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc4w-gemm/gen/f32-qc4w-gemm-4x2-minmax-scalar.c.o [ 5%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc4w-gemm/gen/f32-qc4w-gemm-4x4-minmax-scalar.c.o [ 5%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-1x4-minmax-scalar.c.o [ 5%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-1x4-relu-scalar.c.o [ 5%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-1x4-scalar.c.o [ 5%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-2x4-minmax-scalar.c.o [ 5%] Built target ATEN_CPU_FILES_GEN_TARGET [ 5%] Building C object confu-deps/XNNPACK/CMakeFiles/hardware-config.dir/src/configs/hardware-config.c.o [ 5%] Built target ATEN_CUDA_FILES_GEN_TARGET [ 5%] Built target hardware-config [ 5%] 
Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-2x4-relu-scalar.c.o [ 5%] Building C object confu-deps/XNNPACK/CMakeFiles/indirection.dir/src/indirection.c.o [ 5%] Building CXX object confu-deps/XNNPACK/CMakeFiles/jit.dir/src/jit/aarch32-assembler.cc.o [ 5%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-2x4-scalar.c.o [ 5%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-4x2-minmax-scalar.c.o [ 5%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-4x2-relu-scalar.c.o [ 5%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-4x2-scalar.c.o [ 5%] Building CXX object confu-deps/XNNPACK/CMakeFiles/jit.dir/src/jit/aarch64-assembler.cc.o [ 5%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-4x4-minmax-scalar.c.o [ 5%] Built target indirection [ 5%] Building C object confu-deps/XNNPACK/CMakeFiles/microparams-init.dir/src/microparams-init.c.o [ 5%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-4x4-relu-scalar.c.o [ 5%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-4x4-scalar.c.o [ 5%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-spmm/gen/f32-qc8w-spmm-1x1-minmax-scalar.c.o [ 5%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-spmm/gen/f32-qc8w-spmm-2x1-minmax-scalar.c.o [ 5%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-spmm/gen/f32-qc8w-spmm-4x1-minmax-scalar.c.o [ 5%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-spmm/gen/f32-qc8w-spmm-8x1-minmax-scalar.c.o [ 5%] Building CXX object 
confu-deps/XNNPACK/CMakeFiles/jit.dir/src/jit/assembler.cc.o [ 5%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-spmm/gen/f32-qc8w-spmm-8x2-minmax-scalar.c.o [ 5%] Built target jit [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/cache.dir/src/cache.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-prod.dir/src/amalgam/gen/sse.c.o [ 6%] Built target cache [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/operators.dir/src/operator-delete.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-spmm/gen/f32-qc8w-spmm-8x4-minmax-scalar.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/operators.dir/src/operators/argmax-pooling-nhwc.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/operators.dir/src/operators/average-pooling-nhwc.c.o [ 6%] Built target microparams-init [ 6%] Building CXX object c10/cuda/CMakeFiles/c10_cuda.dir/CUDAAllocatorConfig.cpp.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qs8-vcvt/gen/f32-qs8-vcvt-scalar-fmagic-u1.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qs8-vcvt/gen/f32-qs8-vcvt-scalar-fmagic-u2.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qs8-vcvt/gen/f32-qs8-vcvt-scalar-fmagic-u3.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/operators.dir/src/operators/batch-matrix-multiply-nc.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qs8-vcvt/gen/f32-qs8-vcvt-scalar-fmagic-u4.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qs8-vcvt/gen/f32-qs8-vcvt-scalar-imagic-u1.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qs8-vcvt/gen/f32-qs8-vcvt-scalar-imagic-u2.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/operators.dir/src/operators/binary-elementwise-nd.c.o [ 6%] Building C 
object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qs8-vcvt/gen/f32-qs8-vcvt-scalar-imagic-u3.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qs8-vcvt/gen/f32-qs8-vcvt-scalar-imagic-u4.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qs8-vcvt/gen/f32-qs8-vcvt-scalar-lrintf-u1.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qs8-vcvt/gen/f32-qs8-vcvt-scalar-lrintf-u2.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qs8-vcvt/gen/f32-qs8-vcvt-scalar-lrintf-u3.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qs8-vcvt/gen/f32-qs8-vcvt-scalar-lrintf-u4.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qu8-vcvt/gen/f32-qu8-vcvt-scalar-fmagic-u1.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qu8-vcvt/gen/f32-qu8-vcvt-scalar-fmagic-u2.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/operators.dir/src/operators/channel-shuffle-nc.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qu8-vcvt/gen/f32-qu8-vcvt-scalar-fmagic-u3.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qu8-vcvt/gen/f32-qu8-vcvt-scalar-fmagic-u4.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/operators.dir/src/operators/constant-pad-nd.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qu8-vcvt/gen/f32-qu8-vcvt-scalar-imagic-u1.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qu8-vcvt/gen/f32-qu8-vcvt-scalar-imagic-u2.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qu8-vcvt/gen/f32-qu8-vcvt-scalar-imagic-u3.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/operators.dir/src/operators/convolution-nchw.c.o [ 6%] Building C 
object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qu8-vcvt/gen/f32-qu8-vcvt-scalar-imagic-u4.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qu8-vcvt/gen/f32-qu8-vcvt-scalar-lrintf-u1.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qu8-vcvt/gen/f32-qu8-vcvt-scalar-lrintf-u2.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qu8-vcvt/gen/f32-qu8-vcvt-scalar-lrintf-u3.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qu8-vcvt/gen/f32-qu8-vcvt-scalar-lrintf-u4.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/operators.dir/src/operators/convolution-nhwc.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-scalar-rr2-lut64-p2-u1.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-scalar-rr2-lut64-p2-u2-acc2.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-scalar-rr2-lut64-p2-u2.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-scalar-rr2-lut64-p2-u4-acc2.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-scalar-rr2-lut64-p2-u4-acc4.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-scalar-rr2-lut64-p2-u4.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-scalar-rr2-p5-u1.c.o [ 6%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-scalar-rr2-p5-u2-acc2.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-scalar-rr2-p5-u2.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/operators.dir/src/operators/deconvolution-nhwc.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-scalar-rr2-p5-u4-acc2.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-scalar-rr2-p5-u4-acc4.c.o [ 6%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-prod.dir/src/amalgam/gen/sse2.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-scalar-rr2-p5-u4.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmax-scalar-u1.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmax-scalar-u2-acc2.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmax-scalar-u3-acc3.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmax-scalar-u4-acc2.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/operators.dir/src/operators/dynamic-fully-connected-nc.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmax-scalar-u4-acc4.c.o [ 7%] Building CXX object c10/cuda/CMakeFiles/c10_cuda.dir/CUDACachingAllocator.cpp.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmin-scalar-u1.c.o [ 7%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmin-scalar-u2-acc2.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmin-scalar-u3-acc3.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/operators.dir/src/operators/fully-connected-nc.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmin-scalar-u4-acc2.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmin-scalar-u4-acc4.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rminmax-scalar-u1.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rminmax-scalar-u2-acc2.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rminmax-scalar-u3-acc3.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rminmax-scalar-u4-acc2.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rminmax-scalar-u4-acc4.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/operators.dir/src/operators/global-average-pooling-ncw.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rsum/gen/f32-rsum-scalar-u1.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rsum/gen/f32-rsum-scalar-u2-acc2.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rsum/gen/f32-rsum-scalar-u3-acc3.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/operators.dir/src/operators/global-average-pooling-nwc.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rsum/gen/f32-rsum-scalar-u4-acc2.c.o [ 7%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rsum/gen/f32-rsum-scalar-u4-acc4.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-spmm/gen/f32-spmm-1x1-minmax-scalar-pipelined.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-spmm/gen/f32-spmm-1x1-minmax-scalar.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/operators.dir/src/operators/lut-elementwise-nc.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-spmm/gen/f32-spmm-2x1-minmax-scalar-pipelined.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-spmm/gen/f32-spmm-2x1-minmax-scalar.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-spmm/gen/f32-spmm-4x1-minmax-scalar-pipelined.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/operators.dir/src/operators/max-pooling-nhwc.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-spmm/gen/f32-spmm-4x1-minmax-scalar.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-spmm/gen/f32-spmm-8x1-minmax-scalar-pipelined.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/operators.dir/src/operators/prelu-nc.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-spmm/gen/f32-spmm-8x1-minmax-scalar.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/operators.dir/src/operators/reduce-nd.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-spmm/gen/f32-spmm-8x2-minmax-scalar.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/operators.dir/src/operators/resize-bilinear-nchw.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-spmm/gen/f32-spmm-8x4-minmax-scalar.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/operators.dir/src/operators/resize-bilinear-nhwc.c.o [ 7%] Building C 
object confu-deps/XNNPACK/CMakeFiles/operators.dir/src/operators/rope-nthc.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/operators.dir/src/operators/scaled-dot-product-attention-nhtc.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vadd-minmax-scalar-u1.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vadd-minmax-scalar-u2.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vadd-minmax-scalar-u4.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/operators.dir/src/operators/slice-nd.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vadd-minmax-scalar-u8.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vadd-relu-scalar-u1.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vadd-relu-scalar-u2.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/operators.dir/src/operators/softmax-nc.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vadd-relu-scalar-u4.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vadd-relu-scalar-u8.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/operators.dir/src/operators/transpose-nd.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vadd-scalar-u1.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vadd-scalar-u2.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vadd-scalar-u4.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vadd-scalar-u8.c.o [ 7%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vaddc-minmax-scalar-u1.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/operators.dir/src/operators/unary-elementwise-nc.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vaddc-minmax-scalar-u2.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vaddc-minmax-scalar-u4.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vaddc-minmax-scalar-u8.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vaddc-relu-scalar-u1.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vaddc-relu-scalar-u2.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vaddc-relu-scalar-u4.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vaddc-relu-scalar-u8.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vaddc-scalar-u1.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vaddc-scalar-u2.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vaddc-scalar-u4.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vaddc-scalar-u8.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/operators.dir/src/operators/unpooling-nhwc.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vdiv-minmax-scalar-u1.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vdiv-minmax-scalar-u2.c.o [ 7%] Built target operators [ 7%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vdiv-minmax-scalar-u4.c.o [ 7%] Linking CXX static library ../lib/libcaffe2_protos.a [ 7%] Built target caffe2_protos [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vdiv-minmax-scalar-u8.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/memory-planner.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vdiv-relu-scalar-u1.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vdiv-relu-scalar-u2.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/runtime.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vdiv-relu-scalar-u4.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vdiv-relu-scalar-u8.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vdiv-scalar-u1.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vdiv-scalar-u2.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vdiv-scalar-u4.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-prod.dir/src/amalgam/gen/ssse3.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vdiv-scalar-u8.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vdivc-minmax-scalar-u1.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vdivc-minmax-scalar-u2.c.o [ 7%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vdivc-minmax-scalar-u4.c.o [ 7%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vdivc-minmax-scalar-u8.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vdivc-relu-scalar-u1.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/abs.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vdivc-relu-scalar-u2.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/add2.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vdivc-relu-scalar-u4.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vdivc-relu-scalar-u8.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/argmax-pooling-2d.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vdivc-scalar-u1.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vdivc-scalar-u2.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/average-pooling-2d.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-prod.dir/src/amalgam/gen/sse41.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vdivc-scalar-u4.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vdivc-scalar-u8.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/bankers-rounding.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmax-scalar-u1.c.o [ 8%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmax-scalar-u2.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/batch-matrix-multiply.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmax-scalar-u4.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmax-scalar-u8.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/ceiling.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmaxc-scalar-u1.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/clamp.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmaxc-scalar-u2.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmaxc-scalar-u4.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/concatenate.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmaxc-scalar-u8.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmin-scalar-u1.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmin-scalar-u2.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/convert.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmin-scalar-u4.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmin-scalar-u8.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/convolution-2d.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vminc-scalar-u1.c.o [ 8%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vminc-scalar-u2.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vminc-scalar-u4.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/copy.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vminc-scalar-u8.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/deconvolution-2d.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmul-minmax-scalar-u1.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmul-minmax-scalar-u2.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmul-minmax-scalar-u4.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/depth-to-space-2d.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmul-minmax-scalar-u8.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/depthwise-convolution-2d.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmul-relu-scalar-u1.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmul-relu-scalar-u2.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmul-relu-scalar-u4.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/divide.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmul-relu-scalar-u8.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/elu.c.o [ 8%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmul-scalar-u1.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmul-scalar-u2.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/even-split.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmul-scalar-u4.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmul-scalar-u8.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmulc-minmax-scalar-u1.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/floor.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmulc-minmax-scalar-u2.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/fully-connected-sparse.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmulc-minmax-scalar-u4.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmulc-minmax-scalar-u8.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/fully-connected.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmulc-relu-scalar-u1.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmulc-relu-scalar-u2.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmulc-relu-scalar-u4.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/global-average-pooling.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmulc-relu-scalar-u8.c.o [ 8%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmulc-scalar-u1.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmulc-scalar-u2.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/global-sum-pooling.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmulc-scalar-u4.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/hardswish.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmulc-scalar-u8.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vrdivc-minmax-scalar-u1.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/leaky-relu.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vrdivc-minmax-scalar-u2.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vrdivc-minmax-scalar-u4.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/max-pooling-2d.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-prod.dir/src/amalgam/gen/avx.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vrdivc-minmax-scalar-u8.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vrdivc-relu-scalar-u1.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vrdivc-relu-scalar-u2.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/maximum2.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vrdivc-relu-scalar-u4.c.o [ 8%] Building C object 
confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/minimum2.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vrdivc-relu-scalar-u8.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vrdivc-scalar-u1.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/multiply2.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vrdivc-scalar-u2.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vrdivc-scalar-u4.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/negate.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vrdivc-scalar-u8.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vrsubc-minmax-scalar-u1.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/prelu.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vrsubc-minmax-scalar-u2.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/reshape-helpers.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vrsubc-minmax-scalar-u4.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/scaled-dot-product-attention.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vrsubc-minmax-scalar-u8.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vrsubc-relu-scalar-u1.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vrsubc-relu-scalar-u2.c.o [ 8%] Building C object 
confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/sigmoid.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vrsubc-relu-scalar-u4.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/softmax.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vrsubc-relu-scalar-u8.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vrsubc-scalar-u1.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/space-to-depth-2d.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vrsubc-scalar-u2.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vrsubc-scalar-u4.c.o [ 8%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/square-root.c.o [ 9%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vrsubc-scalar-u8.c.o [ 9%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/square.c.o [ 9%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsqrdiff-scalar-u1.c.o [ 9%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsqrdiff-scalar-u2.c.o [ 9%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/squared-difference.c.o [ 9%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsqrdiff-scalar-u4.c.o [ 9%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsqrdiff-scalar-u8.c.o [ 9%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/static-constant-pad.c.o [ 9%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsqrdiffc-scalar-u1.c.o [ 
9%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsqrdiffc-scalar-u2.c.o [ 9%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/static-mean.c.o [ 9%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsqrdiffc-scalar-u4.c.o [ 9%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsqrdiffc-scalar-u8.c.o [ 9%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/static-reshape.c.o [ 9%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsub-minmax-scalar-u1.c.o [ 9%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsub-minmax-scalar-u2.c.o [ 9%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/static-resize-bilinear-2d.c.o [ 9%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsub-minmax-scalar-u4.c.o [ 9%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/static-slice.c.o [ 9%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsub-minmax-scalar-u8.c.o [ 9%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsub-relu-scalar-u1.c.o [ 9%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/static-transpose.c.o [ 9%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsub-relu-scalar-u2.c.o [ 9%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsub-relu-scalar-u4.c.o [ 9%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/subtract.c.o [ 9%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsub-relu-scalar-u8.c.o [ 9%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsub-scalar-u1.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/tanh.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsub-scalar-u2.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/unpooling-2d.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsub-scalar-u4.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsub-scalar-u8.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/subgraph/validation.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsubc-minmax-scalar-u1.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsubc-minmax-scalar-u2.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/subgraph.dir/src/tensor.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsubc-minmax-scalar-u4.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsubc-minmax-scalar-u8.c.o [ 10%] Built target subgraph [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsubc-relu-scalar-u1.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsubc-relu-scalar-u2.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsubc-relu-scalar-u4.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsubc-relu-scalar-u8.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsubc-scalar-u1.c.o [ 10%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsubc-scalar-u2.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsubc-scalar-u4.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsubc-scalar-u8.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vclamp/gen/f32-vclamp-scalar-u1.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vclamp/gen/f32-vclamp-scalar-u2.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vclamp/gen/f32-vclamp-scalar-u4.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vcmul/gen/f32-vcmul-scalar-u1.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vcmul/gen/f32-vcmul-scalar-u2.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vcmul/gen/f32-vcmul-scalar-u4.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vcmul/gen/f32-vcmul-scalar-u8.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-prod.dir/src/amalgam/gen/f16c.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-scalar-rr2-lut16-p3-u1.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-scalar-rr2-lut16-p3-u2.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-scalar-rr2-lut16-p3-u3.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-scalar-rr2-lut16-p3-u4.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-scalar-rr2-lut16-p3-u5.c.o [ 10%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-scalar-rr2-lut16-p3-u6.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-prod.dir/src/amalgam/gen/xop.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-scalar-rr2-p6-u1.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-scalar-rr2-p6-u2.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-scalar-rr2-p6-u3.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-scalar-rr2-p6-u4.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-scalar-rr2-p6-u5.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-scalar-rr2-p6-u6.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vhswish/gen/f32-vhswish-scalar-u1.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vhswish/gen/f32-vhswish-scalar-u2.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vhswish/gen/f32-vhswish-scalar-u4.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-prod.dir/src/amalgam/gen/fma3.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vlrelu/gen/f32-vlrelu-scalar-u1.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vlrelu/gen/f32-vlrelu-scalar-u2.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vlrelu/gen/f32-vlrelu-scalar-u4.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vmulcaddc/gen/f32-vmulcaddc-c1-minmax-scalar-2x.c.o [ 10%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vmulcaddc/gen/f32-vmulcaddc-c2-minmax-scalar-2x.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vmulcaddc/gen/f32-vmulcaddc-c4-minmax-scalar-2x.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrelu/gen/f32-vrelu-scalar-u1.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrelu/gen/f32-vrelu-scalar-u2.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrelu/gen/f32-vrelu-scalar-u4.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrelu/gen/f32-vrelu-scalar-u8.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndd-scalar-libm-u1.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndd-scalar-libm-u2.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndd-scalar-libm-u4.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndne-scalar-libm-u1.c.o [ 10%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndne-scalar-libm-u2.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndne-scalar-libm-u4.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndu-scalar-libm-u1.c.o [ 11%] Building CXX object c10/cuda/CMakeFiles/c10_cuda.dir/CUDADeviceAssertionHost.cpp.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndu-scalar-libm-u2.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndu-scalar-libm-u4.c.o [ 11%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-prod.dir/src/amalgam/gen/avx2.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndz-scalar-libm-u1.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndz-scalar-libm-u2.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndz-scalar-libm-u4.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-prod.dir/src/amalgam/gen/avx512f.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrsqrt/gen/f32-vrsqrt-scalar-rsqrt-u1.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrsqrt/gen/f32-vrsqrt-scalar-rsqrt-u2.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrsqrt/gen/f32-vrsqrt-scalar-rsqrt-u4.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-scalar-rr2-lut64-p2-div-u1.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-scalar-rr2-lut64-p2-div-u2.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-scalar-rr2-lut64-p2-div-u4.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-scalar-rr2-lut2048-p1-div-u1.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-scalar-rr2-lut2048-p1-div-u2.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-scalar-rr2-lut2048-p1-div-u4.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-scalar-rr2-p5-div-u1.c.o [ 11%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-scalar-rr2-p5-div-u2.c.o [ 11%] Building CXX object c10/cuda/CMakeFiles/c10_cuda.dir/CUDAException.cpp.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-scalar-rr2-p5-div-u4.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsqrt/gen/f32-vsqrt-scalar-sqrt-u1.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsqrt/gen/f32-vsqrt-scalar-sqrt-u2.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsqrt/gen/f32-vsqrt-scalar-sqrt-u4.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-scalar-expm1minus-rr1-lut8-p4h3ts-div-u1.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-scalar-expm1minus-rr1-lut8-p4h3ts-div-u2.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-scalar-expm1minus-rr1-lut8-p4h3ts-div-u4.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-prod.dir/src/amalgam/gen/avx512skx.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-scalar-expm1minus-rr1-p6h5ts-div-u1.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-scalar-expm1minus-rr1-p6h5ts-div-u2.c.o [ 11%] Building CXX object c10/cuda/CMakeFiles/c10_cuda.dir/CUDAFunctions.cpp.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-scalar-expm1minus-rr1-p6h5ts-div-u4.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vunary/gen/f32-vabs-scalar-u1.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vunary/gen/f32-vabs-scalar-u2.c.o [ 
11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vunary/gen/f32-vabs-scalar-u4.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vunary/gen/f32-vneg-scalar-u1.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vunary/gen/f32-vneg-scalar-u2.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vunary/gen/f32-vneg-scalar-u4.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vunary/gen/f32-vsqr-scalar-u1.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vunary/gen/f32-vsqr-scalar-u2.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vunary/gen/f32-vsqr-scalar-u4.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/i16-vlshift/gen/i16-vlshift-scalar-u1.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/i16-vlshift/gen/i16-vlshift-scalar-u2.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/i16-vlshift/gen/i16-vlshift-scalar-u3.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/i16-vlshift/gen/i16-vlshift-scalar-u4.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-expm1minus-scalar-rr2-lut4-p4.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-expm1minus-scalar-rr2-lut8-p3.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-expm1minus-scalar-rr2-lut8-p4.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-expm1minus-scalar-rr2-lut16-p3.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-expm1minus-scalar-rr2-lut16-p4.c.o [ 11%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-expm1minus-scalar-rr2-p5.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-expm1minus-scalar-rr2-p6.c.o [ 11%] Building CXX object c10/cuda/CMakeFiles/c10_cuda.dir/CUDAMallocAsyncAllocator.cpp.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-expminus-scalar-rr2-lut64-p2.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-expminus-scalar-rr2-lut2048-p1.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-expminus-scalar-rr2-p5.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-f16-cvt-scalar-bitcast.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-f16-cvt-scalar-fabsf.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-roundd-scalar-addsub.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-roundd-scalar-cvt.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-roundd-scalar-floor.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-roundne-scalar-addsub.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-roundne-scalar-nearbyint.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-roundne-scalar-rint.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-roundu-scalar-addsub.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-roundu-scalar-ceil.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-roundu-scalar-cvt.c.o [ 11%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-roundz-scalar-addsub.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-prod.dir/src/amalgam/gen/avx512vbmi.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-roundz-scalar-cvt.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-roundz-scalar-trunc.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-scalar-rr2-lut64-p2-div.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-scalar-rr2-lut2048-p1-div.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-scalar-rr2-p5-div.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr1-lut4-p4h2ts-div.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr1-lut4-p4h2ts-rcp.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-prod.dir/src/amalgam/gen/avx512vnni.c.o [ 11%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr1-lut4-p4h3ps-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr1-lut4-p4h3ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr1-lut8-p3h1ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr1-lut8-p4h2ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr1-lut8-p4h2ts-rcp.c.o [ 12%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr1-lut8-p4h3ps-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr1-lut8-p4h3ps-rcp.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr1-lut8-p4h3ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr1-lut8-p4h3ts-rcp.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr1-lut16-p3h1ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr1-lut16-p4h2ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr1-lut16-p4h2ts-rcp.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr1-lut16-p4h3ps-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr1-lut16-p4h3ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-prod.dir/src/amalgam/gen/avx512vnnigfni.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr1-lut32-p3h1ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr1-lut64-p3h1ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr1-p6h4ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr1-p6h5ps-div.c.o [ 12%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr1-p6h5ps-rcp.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr1-p6h5ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr1-p6h5ts-rcp.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr2-lut4-p4h2ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr2-lut4-p4h3ps-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr2-lut4-p4h3ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr2-lut8-p3h1ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr2-lut8-p4h2ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-prod.dir/src/amalgam/gen/avx512amx.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr2-lut8-p4h2ts-rcp.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr2-lut8-p4h3ps-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr2-lut8-p4h3ps-rcp.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr2-lut8-p4h3ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr2-lut8-p4h3ts-rcp.c.o [ 12%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr2-lut16-p3h1ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-prod.dir/src/tables/exp2-k-over-64.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr2-lut16-p4h2ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-prod.dir/src/tables/exp2-k-over-2048.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-prod.dir/src/tables/exp2minus-k-over-4.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-prod.dir/src/tables/exp2minus-k-over-8.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-prod.dir/src/tables/exp2minus-k-over-16.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr2-lut16-p4h3ps-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-prod.dir/src/tables/exp2minus-k-over-32.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-prod.dir/src/tables/exp2minus-k-over-64.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-prod.dir/src/tables/exp2minus-k-over-2048.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-prod.dir/src/tables/vlog.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr2-lut16-p4h3ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr2-lut32-p3h1ts-div.c.o [ 12%] Built target microkernels-prod [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/configs/argmaxpool-config.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr2-lut64-p3h1ts-div.c.o [ 12%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr2-p6h4ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/configs/avgpool-config.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr2-p6h5ps-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1minus-rr2-p6h5ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1plus-rr1-lut4-p4h2ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1plus-rr1-lut4-p4h3ps-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/configs/binary-elementwise-config.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1plus-rr1-lut4-p4h3ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1plus-rr1-lut8-p3h1ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1plus-rr1-lut8-p4h3ps-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1plus-rr1-lut8-p4h2ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1plus-rr1-lut8-p4h3ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/configs/cmul-config.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1plus-rr1-lut16-p3h1ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1plus-rr1-lut16-p4h2ts-div.c.o [ 12%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1plus-rr1-lut16-p4h3ps-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/configs/conv-hwc2chw-config.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/configs/dwconv-config.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1plus-rr1-lut16-p4h3ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/configs/dwconv2d-chw-config.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1plus-rr1-lut32-p3h1ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/configs/experiments-config.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1plus-rr1-lut64-p3h1ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/configs/gavgpool-config.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/configs/gavgpool-cw-config.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1plus-rr1-p6h4ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/configs/gemm-config.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/configs/ibilinear-chw-config.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1plus-rr1-p6h5ps-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/configs/ibilinear-config.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1plus-rr1-p6h5ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1plus-rr2-lut4-p4h2ts-div.c.o [ 12%] Building C 
object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/configs/lut32norm-config.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1plus-rr2-lut4-p4h3ps-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/configs/maxpool-config.c.o [ 12%] Building CXX object c10/cuda/CMakeFiles/c10_cuda.dir/CUDAMiscFunctions.cpp.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1plus-rr2-lut4-p4h3ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/configs/pavgpool-config.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/configs/prelu-config.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1plus-rr2-lut8-p3h1ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/configs/raddstoreexpminusmax-config.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1plus-rr2-lut8-p4h2ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/configs/reduce-config.c.o [ 12%] Building CXX object c10/cuda/CMakeFiles/c10_cuda.dir/CUDAStream.cpp.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/configs/rmax-config.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1plus-rr2-lut8-p4h3ps-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/configs/spmm-config.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/configs/transpose-config.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1plus-rr2-lut8-p4h3ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/configs/unary-elementwise-config.c.o [ 12%] Building C 
object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1plus-rr2-lut16-p3h1ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/configs/unpool-config.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1plus-rr2-lut16-p4h2ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/configs/vmulcaddc-config.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1plus-rr2-lut16-p4h3ps-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/configs/xx-fill-config.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1plus-rr2-lut16-p4h3ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/configs/xx-pad-config.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1plus-rr2-lut32-p3h1ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/configs/x8-lut-config.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1plus-rr2-lut64-p3h1ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/configs/zip-config.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/init.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1plus-rr2-p6h4ts-div.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/params.c.o [ 12%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1plus-rr2-p6h5ps-div.c.o [ 12%] Linking CXX static library ../../lib/libXNNPACK.a [ 13%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-scalar-expm1plus-rr2-p6h5ts-div.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/u32-sqrt-scalar-bitmanip.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/u32-sqrt-scalar-clz-binsearch.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/u32-sqrt-scalar-clz-newton.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/u32-sqrt-scalar-cvti32-sqrt-lrint.c.o [ 13%] Built target XNNPACK [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/u32-sqrt-scalar-cvti64-sqrt-lrint.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/u32-sqrt-scalar-cvti64-sqrtf-lrintf.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/u32-sqrt-scalar-cvtu32-sqrt-lrint.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/u32-sqrt-scalar-cvtu32-sqrtf-lrintf.c.o [ 13%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/AccumulateType.cpp.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/u32-sqrt-scalar-hashemian.c.o [ 13%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/CPUGeneratorImpl.cpp.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/u32-sqrt-scalar-tflm.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/u64-sqrt-scalar-cvtu32-sqrt-cvtsatu32f64.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/u64-sqrt-scalar-cvtu32-sqrt-llrint.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/u64-sqrt-scalar-cvtu64-sqrt-llrint.c.o [ 13%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-1x1-minmax-scalar.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-1x2-minmax-scalar.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-1x4-minmax-scalar.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-1x8-minmax-scalar.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-2x2-minmax-scalar.c.o [ 13%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/CachedTensorUtils.cpp.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-2x4-minmax-scalar.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-2x8-minmax-scalar.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-4x4-minmax-scalar.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-1x2-minmax-scalar.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-1x4-minmax-scalar.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-1x8-minmax-scalar.c.o [ 13%] Building CXX object c10/cuda/CMakeFiles/c10_cuda.dir/impl/CUDAGuardImpl.cpp.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-2x2-minmax-scalar.c.o [ 13%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-2x4-minmax-scalar.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-2x8-minmax-scalar.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-4x4-minmax-scalar.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-1x2-minmax-scalar.c.o [ 13%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/ConjugateFallback.cpp.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-1x4-minmax-scalar.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-1x8-minmax-scalar.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-2x2-minmax-scalar.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-2x4-minmax-scalar.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-2x8-minmax-scalar.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-4x4-minmax-scalar.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-5f5m5l1c1s1r-minmax-fp32-scalar-fmagic.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-5f5m5l1c1s1r-minmax-fp32-scalar-imagic.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-5f5m5l1c1s1r-minmax-fp32-scalar-lrintf.c.o [ 13%] Building CXX object 
c10/cuda/CMakeFiles/c10_cuda.dir/impl/CUDATest.cpp.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-5f5m5l2c1s1r-minmax-fp32-scalar-fmagic.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-5f5m5l2c1s1r-minmax-fp32-scalar-imagic.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-5f5m5l2c1s1r-minmax-fp32-scalar-lrintf.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-5f5m5l4c1s1r-minmax-fp32-scalar-fmagic.c.o [ 13%] Building CXX object c10/cuda/CMakeFiles/c10_cuda.dir/driver_api.cpp.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-5f5m5l4c1s1r-minmax-fp32-scalar-imagic.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-5f5m5l4c1s1r-minmax-fp32-scalar-lrintf.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-6f6m7l1c1s1r-minmax-fp32-scalar-fmagic.c.o [ 13%] Linking CXX shared library ../../lib/libc10_cuda.so [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-6f6m7l1c1s1r-minmax-fp32-scalar-imagic.c.o Warning: Unused direct dependencies: libc10.so.2.4 /lib64/libgflags.so.2.2 /lib64/libglog.so.0 /lib64/libm.so.6 [ 13%] Built target c10_cuda [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-6f6m7l1c1s1r-minmax-fp32-scalar-lrintf.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-6f6m7l2c1s1r-minmax-fp32-scalar-fmagic.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-6f6m7l2c1s1r-minmax-fp32-scalar-imagic.c.o [ 13%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-6f6m7l2c1s1r-minmax-fp32-scalar-lrintf.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-6f6m7l4c1s1r-minmax-fp32-scalar-fmagic.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-6f6m7l4c1s1r-minmax-fp32-scalar-imagic.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-6f6m7l4c1s1r-minmax-fp32-scalar-lrintf.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-8f8m9l1c1s1r-minmax-fp32-scalar-fmagic.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-8f8m9l1c1s1r-minmax-fp32-scalar-imagic.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-8f8m9l1c1s1r-minmax-fp32-scalar-lrintf.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-8f8m9l2c1s1r-minmax-fp32-scalar-fmagic.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-8f8m9l2c1s1r-minmax-fp32-scalar-imagic.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-8f8m9l2c1s1r-minmax-fp32-scalar-lrintf.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-8f8m9l4c1s1r-minmax-fp32-scalar-fmagic.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-8f8m9l4c1s1r-minmax-fp32-scalar-imagic.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-8f8m9l4c1s1r-minmax-fp32-scalar-lrintf.c.o [ 13%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p1c-minmax-fp32-scalar-fmagic.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p1c-minmax-fp32-scalar-imagic.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p1c-minmax-fp32-scalar-lrintf.c.o [ 13%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/Context.cpp.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p1c-minmax-rndnu-scalar.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p2c-minmax-fp32-scalar-fmagic.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p2c-minmax-fp32-scalar-imagic.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p2c-minmax-fp32-scalar-lrintf.c.o [ 13%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p2c-minmax-rndnu-scalar.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p4c-minmax-fp32-scalar-fmagic.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p4c-minmax-fp32-scalar-imagic.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p4c-minmax-fp32-scalar-lrintf.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p4c-minmax-rndnu-scalar.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p1c-minmax-fp32-scalar-fmagic.c.o [ 14%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p1c-minmax-fp32-scalar-imagic.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p1c-minmax-fp32-scalar-lrintf.c.o [ 14%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/DLConvertor.cpp.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p2c-minmax-fp32-scalar-fmagic.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p2c-minmax-fp32-scalar-imagic.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p2c-minmax-fp32-scalar-lrintf.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p4c-minmax-fp32-scalar-fmagic.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p4c-minmax-fp32-scalar-imagic.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p4c-minmax-fp32-scalar-lrintf.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-f32-vcvt/gen/qs8-f32-vcvt-scalar-u1.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-f32-vcvt/gen/qs8-f32-vcvt-scalar-u2.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-f32-vcvt/gen/qs8-f32-vcvt-scalar-u3.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-f32-vcvt/gen/qs8-f32-vcvt-scalar-u4.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-gavgpool/gen/qs8-gavgpool-7p7x-minmax-fp32-scalar-fmagic-c1.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-gavgpool/gen/qs8-gavgpool-7p7x-minmax-fp32-scalar-fmagic-c2.c.o [ 
14%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/DeviceAccelerator.cpp.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-gavgpool/gen/qs8-gavgpool-7p7x-minmax-fp32-scalar-fmagic-c4.c.o [ 14%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/Dispatch.cpp.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-gavgpool/gen/qs8-gavgpool-7p7x-minmax-fp32-scalar-imagic-c1.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-gavgpool/gen/qs8-gavgpool-7p7x-minmax-fp32-scalar-imagic-c2.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-gavgpool/gen/qs8-gavgpool-7p7x-minmax-fp32-scalar-imagic-c4.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-gavgpool/gen/qs8-gavgpool-7p7x-minmax-fp32-scalar-lrintf-c1.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-gavgpool/gen/qs8-gavgpool-7p7x-minmax-fp32-scalar-lrintf-c2.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-gavgpool/gen/qs8-gavgpool-7p7x-minmax-fp32-scalar-lrintf-c4.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-gavgpool/gen/qs8-gavgpool-7x-minmax-fp32-scalar-fmagic-c1.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-gavgpool/gen/qs8-gavgpool-7x-minmax-fp32-scalar-fmagic-c2.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-gavgpool/gen/qs8-gavgpool-7x-minmax-fp32-scalar-fmagic-c4.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-gavgpool/gen/qs8-gavgpool-7x-minmax-fp32-scalar-imagic-c1.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-gavgpool/gen/qs8-gavgpool-7x-minmax-fp32-scalar-imagic-c2.c.o [ 14%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-gavgpool/gen/qs8-gavgpool-7x-minmax-fp32-scalar-imagic-c4.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-gavgpool/gen/qs8-gavgpool-7x-minmax-fp32-scalar-lrintf-c1.c.o [ 14%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/DynamicLibrary.cpp.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-gavgpool/gen/qs8-gavgpool-7x-minmax-fp32-scalar-lrintf-c2.c.o [ 14%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/EmptyTensor.cpp.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-gavgpool/gen/qs8-gavgpool-7x-minmax-fp32-scalar-lrintf-c4.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-3p1c-minmax-fp32-scalar-fmagic.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-3p2c-minmax-fp32-scalar-imagic.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-3p2c-minmax-fp32-scalar-lrintf.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-4p2c-minmax-fp32-scalar-imagic.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-5f5m5l1c1s1r-minmax-fp32-scalar-fmagic.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-5f5m5l1c1s1r-minmax-fp32-scalar-imagic.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-5f5m5l1c1s1r-minmax-fp32-scalar-lrintf.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-5f5m5l2c1s1r-minmax-fp32-scalar-fmagic.c.o [ 14%] 
Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-5f5m5l2c1s1r-minmax-fp32-scalar-imagic.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-5f5m5l2c1s1r-minmax-fp32-scalar-lrintf.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-5f5m5l4c1s1r-minmax-fp32-scalar-fmagic.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-5f5m5l4c1s1r-minmax-fp32-scalar-imagic.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-5f5m5l4c1s1r-minmax-fp32-scalar-lrintf.c.o [ 14%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/ExpandUtils.cpp.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-6f6m7l1c1s1r-minmax-fp32-scalar-fmagic.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-6f6m7l1c1s1r-minmax-fp32-scalar-imagic.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-6f6m7l1c1s1r-minmax-fp32-scalar-lrintf.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-6f6m7l2c1s1r-minmax-fp32-scalar-fmagic.c.o [ 14%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/FuncTorchTLS.cpp.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-6f6m7l2c1s1r-minmax-fp32-scalar-imagic.c.o [ 14%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/FunctionalInverses.cpp.o [ 14%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-6f6m7l2c1s1r-minmax-fp32-scalar-lrintf.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-6f6m7l4c1s1r-minmax-fp32-scalar-fmagic.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-6f6m7l4c1s1r-minmax-fp32-scalar-imagic.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-6f6m7l4c1s1r-minmax-fp32-scalar-lrintf.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-8f8m9l1c1s1r-minmax-fp32-scalar-fmagic.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-8f8m9l1c1s1r-minmax-fp32-scalar-imagic.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-8f8m9l1c1s1r-minmax-fp32-scalar-lrintf.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-8f8m9l2c1s1r-minmax-fp32-scalar-fmagic.c.o [ 14%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/FunctionalStorageImpl.cpp.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-8f8m9l2c1s1r-minmax-fp32-scalar-imagic.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-8f8m9l2c1s1r-minmax-fp32-scalar-lrintf.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-8f8m9l4c1s1r-minmax-fp32-scalar-fmagic.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-8f8m9l4c1s1r-minmax-fp32-scalar-imagic.c.o [ 14%] 
Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-8f8m9l4c1s1r-minmax-fp32-scalar-lrintf.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p1c-minmax-fp32-scalar-fmagic.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p1c-minmax-fp32-scalar-imagic.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p1c-minmax-fp32-scalar-lrintf.c.o [ 14%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/FunctionalTensorWrapper.cpp.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p2c-minmax-fp32-scalar-fmagic.c.o [ 14%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p2c-minmax-fp32-scalar-imagic.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p2c-minmax-fp32-scalar-lrintf.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p4c-minmax-fp32-scalar-fmagic.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p4c-minmax-fp32-scalar-imagic.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p4c-minmax-fp32-scalar-lrintf.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p1c-minmax-fp32-scalar-fmagic.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p1c-minmax-fp32-scalar-imagic.c.o [ 15%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p1c-minmax-fp32-scalar-lrintf.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p2c-minmax-fp32-scalar-fmagic.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p2c-minmax-fp32-scalar-imagic.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p2c-minmax-fp32-scalar-lrintf.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p4c-minmax-fp32-scalar-fmagic.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p4c-minmax-fp32-scalar-imagic.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p4c-minmax-fp32-scalar-lrintf.c.o [ 15%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/FunctionalizeFallbackKernel.cpp.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x2-minmax-fp32-scalar-fmagic.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x2-minmax-fp32-scalar-imagic.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x2-minmax-fp32-scalar-lrintf.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x4-minmax-fp32-scalar-fmagic.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x4-minmax-fp32-scalar-imagic.c.o [ 15%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x4-minmax-fp32-scalar-lrintf.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x2-minmax-fp32-scalar-fmagic.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x2-minmax-fp32-scalar-imagic.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x2-minmax-fp32-scalar-lrintf.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x4-minmax-fp32-scalar-fmagic.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x4-minmax-fp32-scalar-imagic.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x4-minmax-fp32-scalar-lrintf.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x2-minmax-fp32-scalar-fmagic.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x2-minmax-fp32-scalar-imagic.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x2-minmax-fp32-scalar-lrintf.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x4-minmax-fp32-scalar-fmagic.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x4-minmax-fp32-scalar-imagic.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x4-minmax-fp32-scalar-lrintf.c.o [ 15%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/LegacyBatchedFallback.cpp.o [ 15%] Building C 
object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-4x2-minmax-fp32-scalar-fmagic.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-4x2-minmax-fp32-scalar-imagic.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-4x2-minmax-fp32-scalar-lrintf.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-4x4-minmax-fp32-scalar-fmagic.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-4x4-minmax-fp32-scalar-imagic.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-4x4-minmax-fp32-scalar-lrintf.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x2-minmax-fp32-scalar-fmagic.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x2-minmax-fp32-scalar-imagic.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x2-minmax-fp32-scalar-lrintf.c.o [ 15%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/LegacyBatchedTensorImpl.cpp.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x4-minmax-fp32-scalar-fmagic.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x4-minmax-fp32-scalar-imagic.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x4-minmax-fp32-scalar-lrintf.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x2-minmax-fp32-scalar-fmagic.c.o 
[ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x2-minmax-fp32-scalar-imagic.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x2-minmax-fp32-scalar-lrintf.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x4-minmax-fp32-scalar-fmagic.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x4-minmax-fp32-scalar-imagic.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x4-minmax-fp32-scalar-lrintf.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x2-minmax-fp32-scalar-fmagic.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x2-minmax-fp32-scalar-imagic.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x2-minmax-fp32-scalar-lrintf.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x4-minmax-fp32-scalar-fmagic.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x4-minmax-fp32-scalar-imagic.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x4-minmax-fp32-scalar-lrintf.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-4x2-minmax-fp32-scalar-fmagic.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-4x2-minmax-fp32-scalar-imagic.c.o [ 15%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-4x2-minmax-fp32-scalar-lrintf.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-4x4-minmax-fp32-scalar-fmagic.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-4x4-minmax-fp32-scalar-imagic.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-4x4-minmax-fp32-scalar-lrintf.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-requantization/qs8-requantization-fp32-scalar-fmagic.c.o [ 15%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/LegacyBatchingRegistrations.cpp.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-requantization/qs8-requantization-fp32-scalar-lrintf.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-requantization/qs8-requantization-gemmlowp-scalar.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-requantization/qs8-requantization-rndna-scalar-signed64.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-requantization/qs8-requantization-rndna-scalar-unsigned32.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-requantization/qs8-requantization-rndna-scalar-unsigned64.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-requantization/qs8-requantization-rndnu-scalar.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vadd/gen/qs8-vadd-minmax-scalar-u1.c.o [ 15%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vadd/gen/qs8-vadd-minmax-scalar-u2.c.o [ 16%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vadd/gen/qs8-vadd-minmax-scalar-u4.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vaddc/gen/qs8-vaddc-minmax-scalar-u1.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vaddc/gen/qs8-vaddc-minmax-scalar-u2.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vaddc/gen/qs8-vaddc-minmax-scalar-u4.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vcvt/gen/qs8-vcvt-scalar-u1.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vcvt/gen/qs8-vcvt-scalar-u2.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vcvt/gen/qs8-vcvt-scalar-u4.c.o [ 16%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/LegacyVmapMode.cpp.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vhswish/gen/qs8-vhswish-scalar-u1.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vhswish/gen/qs8-vhswish-scalar-u2.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vhswish/gen/qs8-vhswish-scalar-u4.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vlrelu/gen/qs8-vlrelu-scalar-andxor-u1.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vlrelu/gen/qs8-vlrelu-scalar-andxor-u2.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vlrelu/gen/qs8-vlrelu-scalar-andxor-u4.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vlrelu/gen/qs8-vlrelu-scalar-select-u1.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vlrelu/gen/qs8-vlrelu-scalar-select-u2.c.o [ 16%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vlrelu/gen/qs8-vlrelu-scalar-select-u4.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vmul/gen/qs8-vmul-minmax-fp32-scalar-u1.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vmul/gen/qs8-vmul-minmax-fp32-scalar-u2.c.o [ 16%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/LegacyVmapTransforms.cpp.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vmul/gen/qs8-vmul-minmax-fp32-scalar-u4.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vmulc/gen/qs8-vmulc-minmax-fp32-scalar-u1.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vmulc/gen/qs8-vmulc-minmax-fp32-scalar-u2.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vmulc/gen/qs8-vmulc-minmax-fp32-scalar-u4.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs16-qs8-vcvt/gen/qs16-qs8-vcvt-scalar-u1.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs16-qs8-vcvt/gen/qs16-qs8-vcvt-scalar-u2.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs16-qs8-vcvt/gen/qs16-qs8-vcvt-scalar-u4.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-avgpool/qu8-avgpool-9p8x-minmax-fp32-scalar-imagic-c1.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-avgpool/qu8-avgpool-9x-minmax-fp32-scalar-imagic-c1.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-5f5m5l1c1s1r-minmax-fp32-scalar-fmagic.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-5f5m5l1c1s1r-minmax-fp32-scalar-imagic.c.o [ 16%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-5f5m5l1c1s1r-minmax-fp32-scalar-lrintf.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-5f5m5l2c1s1r-minmax-fp32-scalar-fmagic.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-5f5m5l2c1s1r-minmax-fp32-scalar-imagic.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-5f5m5l2c1s1r-minmax-fp32-scalar-lrintf.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-5f5m5l4c1s1r-minmax-fp32-scalar-fmagic.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-5f5m5l4c1s1r-minmax-fp32-scalar-imagic.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-5f5m5l4c1s1r-minmax-fp32-scalar-lrintf.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-6f6m7l1c1s1r-minmax-fp32-scalar-fmagic.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-6f6m7l1c1s1r-minmax-fp32-scalar-imagic.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-6f6m7l1c1s1r-minmax-fp32-scalar-lrintf.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-6f6m7l2c1s1r-minmax-fp32-scalar-fmagic.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-6f6m7l2c1s1r-minmax-fp32-scalar-imagic.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-6f6m7l2c1s1r-minmax-fp32-scalar-lrintf.c.o [ 16%] Building CXX object 
caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/MapAllocator.cpp.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-6f6m7l4c1s1r-minmax-fp32-scalar-fmagic.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-6f6m7l4c1s1r-minmax-fp32-scalar-imagic.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-6f6m7l4c1s1r-minmax-fp32-scalar-lrintf.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-8f8m9l1c1s1r-minmax-fp32-scalar-fmagic.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-8f8m9l1c1s1r-minmax-fp32-scalar-imagic.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-8f8m9l1c1s1r-minmax-fp32-scalar-lrintf.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-8f8m9l2c1s1r-minmax-fp32-scalar-fmagic.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-8f8m9l2c1s1r-minmax-fp32-scalar-imagic.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-8f8m9l2c1s1r-minmax-fp32-scalar-lrintf.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-8f8m9l4c1s1r-minmax-fp32-scalar-fmagic.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-8f8m9l4c1s1r-minmax-fp32-scalar-imagic.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-8f8m9l4c1s1r-minmax-fp32-scalar-lrintf.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-9p1c-minmax-fp32-scalar-fmagic.c.o [ 
16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-9p1c-minmax-fp32-scalar-imagic.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-9p1c-minmax-fp32-scalar-lrintf.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-9p1c-minmax-rndnu-scalar.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-9p2c-minmax-fp32-scalar-fmagic.c.o [ 16%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/MemoryOverlap.cpp.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-9p2c-minmax-fp32-scalar-imagic.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-9p2c-minmax-fp32-scalar-lrintf.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-9p2c-minmax-rndnu-scalar.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-9p4c-minmax-fp32-scalar-fmagic.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-9p4c-minmax-fp32-scalar-imagic.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-9p4c-minmax-fp32-scalar-lrintf.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-9p4c-minmax-rndnu-scalar.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-25p1c-minmax-fp32-scalar-fmagic.c.o [ 16%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/NamedTensorUtils.cpp.o [ 16%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-25p1c-minmax-fp32-scalar-imagic.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-25p1c-minmax-fp32-scalar-lrintf.c.o [ 16%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/NestedTensorImpl.cpp.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-25p2c-minmax-fp32-scalar-fmagic.c.o [ 16%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-25p2c-minmax-fp32-scalar-imagic.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-25p2c-minmax-fp32-scalar-lrintf.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-25p4c-minmax-fp32-scalar-fmagic.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-25p4c-minmax-fp32-scalar-imagic.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-25p4c-minmax-fp32-scalar-lrintf.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-f32-vcvt/gen/qu8-f32-vcvt-scalar-u1.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-f32-vcvt/gen/qu8-f32-vcvt-scalar-u2.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-f32-vcvt/gen/qu8-f32-vcvt-scalar-u3.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-f32-vcvt/gen/qu8-f32-vcvt-scalar-u4.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gavgpool/gen/qu8-gavgpool-7p7x-minmax-fp32-scalar-fmagic-c1.c.o [ 17%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gavgpool/gen/qu8-gavgpool-7p7x-minmax-fp32-scalar-fmagic-c2.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gavgpool/gen/qu8-gavgpool-7p7x-minmax-fp32-scalar-fmagic-c4.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gavgpool/gen/qu8-gavgpool-7p7x-minmax-fp32-scalar-imagic-c1.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gavgpool/gen/qu8-gavgpool-7p7x-minmax-fp32-scalar-imagic-c2.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gavgpool/gen/qu8-gavgpool-7p7x-minmax-fp32-scalar-imagic-c4.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gavgpool/gen/qu8-gavgpool-7p7x-minmax-fp32-scalar-lrintf-c1.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gavgpool/gen/qu8-gavgpool-7p7x-minmax-fp32-scalar-lrintf-c2.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gavgpool/gen/qu8-gavgpool-7p7x-minmax-fp32-scalar-lrintf-c4.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gavgpool/gen/qu8-gavgpool-7x-minmax-fp32-scalar-fmagic-c1.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gavgpool/gen/qu8-gavgpool-7x-minmax-fp32-scalar-fmagic-c2.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gavgpool/gen/qu8-gavgpool-7x-minmax-fp32-scalar-fmagic-c4.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gavgpool/gen/qu8-gavgpool-7x-minmax-fp32-scalar-imagic-c1.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gavgpool/gen/qu8-gavgpool-7x-minmax-fp32-scalar-imagic-c2.c.o [ 17%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gavgpool/gen/qu8-gavgpool-7x-minmax-fp32-scalar-imagic-c4.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gavgpool/gen/qu8-gavgpool-7x-minmax-fp32-scalar-lrintf-c1.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gavgpool/gen/qu8-gavgpool-7x-minmax-fp32-scalar-lrintf-c2.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gavgpool/gen/qu8-gavgpool-7x-minmax-fp32-scalar-lrintf-c4.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-1x2-minmax-fp32-scalar-fmagic.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-1x2-minmax-fp32-scalar-imagic.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-1x2-minmax-fp32-scalar-lrintf.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-1x2-minmax-rndnu-scalar.c.o [ 17%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/ParallelCommon.cpp.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-1x4-minmax-fp32-scalar-fmagic.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-1x4-minmax-fp32-scalar-imagic.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-1x4-minmax-fp32-scalar-lrintf.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-1x4-minmax-rndnu-scalar.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-2x2-minmax-fp32-scalar-fmagic.c.o [ 17%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-2x2-minmax-fp32-scalar-imagic.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-2x2-minmax-fp32-scalar-lrintf.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-2x2-minmax-rndnu-scalar.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-2x4-minmax-fp32-scalar-fmagic.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-2x4-minmax-fp32-scalar-imagic.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-2x4-minmax-fp32-scalar-lrintf.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-2x4-minmax-rndnu-scalar.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-3x2-minmax-fp32-scalar-fmagic.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-3x2-minmax-fp32-scalar-imagic.c.o [ 17%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/ParallelNative.cpp.o [ 17%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/ParallelNativeTBB.cpp.o [ 17%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/ParallelOpenMP.cpp.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-3x2-minmax-fp32-scalar-lrintf.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-3x2-minmax-rndnu-scalar.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-3x4-minmax-fp32-scalar-fmagic.c.o [ 17%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-3x4-minmax-fp32-scalar-imagic.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-3x4-minmax-fp32-scalar-lrintf.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-3x4-minmax-rndnu-scalar.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-4x2-minmax-fp32-scalar-fmagic.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-4x2-minmax-fp32-scalar-imagic.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-4x2-minmax-fp32-scalar-lrintf.c.o [ 17%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/ParallelThreadPoolNative.cpp.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-4x2-minmax-rndnu-scalar.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-4x4-minmax-fp32-scalar-fmagic.c.o [ 17%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/PythonTorchFunctionTLS.cpp.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-4x4-minmax-fp32-scalar-imagic.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-4x4-minmax-fp32-scalar-lrintf.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-4x4-minmax-rndnu-scalar.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-1x2-minmax-fp32-scalar-fmagic.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-1x2-minmax-fp32-scalar-imagic.c.o [ 17%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-1x2-minmax-fp32-scalar-lrintf.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-1x2-minmax-rndnu-scalar.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-1x4-minmax-fp32-scalar-fmagic.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-1x4-minmax-fp32-scalar-imagic.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-1x4-minmax-fp32-scalar-lrintf.c.o [ 17%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/SavedTensorHooks.cpp.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-1x4-minmax-rndnu-scalar.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-2x2-minmax-fp32-scalar-fmagic.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-2x2-minmax-fp32-scalar-imagic.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-2x2-minmax-fp32-scalar-lrintf.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-2x2-minmax-rndnu-scalar.c.o [ 17%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-2x4-minmax-fp32-scalar-fmagic.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-2x4-minmax-fp32-scalar-imagic.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-2x4-minmax-fp32-scalar-lrintf.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-2x4-minmax-rndnu-scalar.c.o [ 18%] Building 
CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/ScalarOps.cpp.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-3x2-minmax-fp32-scalar-fmagic.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-3x2-minmax-fp32-scalar-imagic.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-3x2-minmax-fp32-scalar-lrintf.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-3x2-minmax-rndnu-scalar.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-3x4-minmax-fp32-scalar-fmagic.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-3x4-minmax-fp32-scalar-imagic.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-3x4-minmax-fp32-scalar-lrintf.c.o [ 18%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/SequenceNumber.cpp.o [ 18%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/SparseCsrTensorImpl.cpp.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-3x4-minmax-rndnu-scalar.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-4x2-minmax-fp32-scalar-fmagic.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-4x2-minmax-fp32-scalar-imagic.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-4x2-minmax-fp32-scalar-lrintf.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-4x2-minmax-rndnu-scalar.c.o [ 18%] Building CXX object 
caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/SparseTensorImpl.cpp.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-4x4-minmax-fp32-scalar-fmagic.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-4x4-minmax-fp32-scalar-imagic.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-4x4-minmax-fp32-scalar-lrintf.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-4x4-minmax-rndnu-scalar.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-requantization/qu8-requantization-fp32-scalar-fmagic.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-requantization/qu8-requantization-fp32-scalar-lrintf.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-requantization/qu8-requantization-gemmlowp-scalar.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-requantization/qu8-requantization-rndna-scalar-signed64.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-requantization/qu8-requantization-rndna-scalar-unsigned32.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-requantization/qu8-requantization-rndna-scalar-unsigned64.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vadd/gen/qu8-vadd-minmax-scalar-u1.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vadd/gen/qu8-vadd-minmax-scalar-u2.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vadd/gen/qu8-vadd-minmax-scalar-u4.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vaddc/gen/qu8-vaddc-minmax-scalar-u1.c.o [ 18%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vaddc/gen/qu8-vaddc-minmax-scalar-u2.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vaddc/gen/qu8-vaddc-minmax-scalar-u4.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vcvt/gen/qu8-vcvt-scalar-u1.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vcvt/gen/qu8-vcvt-scalar-u2.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vcvt/gen/qu8-vcvt-scalar-u4.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vhswish/gen/qu8-vhswish-scalar-u1.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vhswish/gen/qu8-vhswish-scalar-u2.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vhswish/gen/qu8-vhswish-scalar-u4.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vlrelu/gen/qu8-vlrelu-scalar-andxor-u1.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vlrelu/gen/qu8-vlrelu-scalar-andxor-u2.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vlrelu/gen/qu8-vlrelu-scalar-andxor-u4.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vlrelu/gen/qu8-vlrelu-scalar-select-u1.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vlrelu/gen/qu8-vlrelu-scalar-select-u2.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vlrelu/gen/qu8-vlrelu-scalar-select-u4.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vmul/gen/qu8-vmul-minmax-fp32-scalar-u1.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vmul/gen/qu8-vmul-minmax-fp32-scalar-u2.c.o [ 18%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vmul/gen/qu8-vmul-minmax-fp32-scalar-u4.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vmulc/gen/qu8-vmulc-minmax-fp32-scalar-u1.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vmulc/gen/qu8-vmulc-minmax-fp32-scalar-u2.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vmulc/gen/qu8-vmulc-minmax-fp32-scalar-u4.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/s8-ibilinear/gen/s8-ibilinear-scalar-c1.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/s8-ibilinear/gen/s8-ibilinear-scalar-c2.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/s8-ibilinear/gen/s8-ibilinear-scalar-c4.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/s8-maxpool/s8-maxpool-9p8x-minmax-scalar-c1.c.o [ 18%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/StorageUtils.cpp.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/s8-vclamp/s8-vclamp-scalar-u4.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/s16-rmaxabs/gen/s16-rmaxabs-scalar-x1.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/s16-rmaxabs/gen/s16-rmaxabs-scalar-x2.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/s16-rmaxabs/gen/s16-rmaxabs-scalar-x3.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/s16-rmaxabs/gen/s16-rmaxabs-scalar-x4.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/s16-window/gen/s16-window-scalar-u1.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/s16-window/gen/s16-window-scalar-u2.c.o [ 18%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/s16-window/gen/s16-window-scalar-u3.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/s16-window/gen/s16-window-scalar-u4.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/u8-ibilinear/gen/u8-ibilinear-scalar-c1.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/u8-ibilinear/gen/u8-ibilinear-scalar-c2.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/u8-ibilinear/gen/u8-ibilinear-scalar-c4.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/u8-lut32norm/u8-lut32norm-scalar.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/u8-maxpool/u8-maxpool-9p8x-minmax-scalar-c1.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/u8-rmax/u8-rmax-scalar-u2.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/u8-vclamp/u8-vclamp-scalar-u4.c.o [ 18%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/u32-filterbank-accumulate/gen/u32-filterbank-accumulate-scalar-x1.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/u32-filterbank-subtract/u32-filterbank-subtract-scalar-x2.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/u32-vlog/gen/u32-vlog-scalar-x1.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/u32-vlog/gen/u32-vlog-scalar-x2.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/u32-vlog/gen/u32-vlog-scalar-x3.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/u32-vlog/gen/u32-vlog-scalar-x4.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/u64-u32-vsqrtshift/u64-u32-vsqrtshift-scalar-cvtu32-sqrt-cvtu32f64-u1.c.o [ 19%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-lut/gen/x8-lut-scalar-u1.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-lut/gen/x8-lut-scalar-u2.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-lut/gen/x8-lut-scalar-u4.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-lut/gen/x8-lut-scalar-u8.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-lut/gen/x8-lut-scalar-u16.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-packw/gen/x8-packw-x2-gemm-goi-scalar-int-u2.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-packw/gen/x8-packw-x2-gemm-goi-scalar-int-u4.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-packw/gen/x8-packw-x4-gemm-goi-scalar-int-u2.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-packw/gen/x8-packw-x4-gemm-goi-scalar-int-u4.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-packw/gen/x8-packw-x8-gemm-goi-scalar-int-u2.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-packw/gen/x8-packw-x8-gemm-goi-scalar-int-u4.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-packw/gen/x8-packw-x16-gemm-goi-scalar-int-u2.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-packw/gen/x8-packw-x16-gemm-goi-scalar-int-u4.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-packw/gen/x8-packw-x32-gemm-goi-scalar-int-u2.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-packw/gen/x8-packw-x32-gemm-goi-scalar-int-u4.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-transposec/gen/x8-transposec-1x2-scalar-int.c.o [ 19%] 
Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-transposec/gen/x8-transposec-1x4-scalar-int.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-transposec/gen/x8-transposec-2x1-scalar-int.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-transposec/gen/x8-transposec-2x2-scalar-int.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-transposec/gen/x8-transposec-2x4-scalar-int.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-transposec/gen/x8-transposec-4x1-scalar-int.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-transposec/gen/x8-transposec-4x2-scalar-int.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-transposec/gen/x8-transposec-4x4-scalar-int.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-zip/x8-zip-x2-scalar.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-zip/x8-zip-x3-scalar.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-zip/x8-zip-x4-scalar.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-zip/x8-zip-xm-scalar.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x16-packw/gen/x16-packw-x8-gemm-goi-scalar-int-u4.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x16-packw/gen/x16-packw-x16-gemm-goi-scalar-int-u4.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x16-transposec/gen/x16-transposec-1x2-scalar-int.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x16-transposec/gen/x16-transposec-1x4-scalar-int.c.o [ 19%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x16-transposec/gen/x16-transposec-2x1-scalar-int.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x16-transposec/gen/x16-transposec-2x2-scalar-int.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x16-transposec/gen/x16-transposec-2x4-scalar-int.c.o [ 19%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/TensorGeometry.cpp.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x16-transposec/gen/x16-transposec-4x1-scalar-int.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x16-transposec/gen/x16-transposec-4x2-scalar-int.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x16-transposec/gen/x16-transposec-4x4-scalar-int.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x24-transposec/gen/x24-transposec-1x2-scalar.c.o [ 19%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/TensorIndexing.cpp.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x24-transposec/gen/x24-transposec-1x4-scalar.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x24-transposec/gen/x24-transposec-2x1-scalar.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x24-transposec/gen/x24-transposec-2x2-scalar.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x24-transposec/gen/x24-transposec-2x4-scalar.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x24-transposec/gen/x24-transposec-4x1-scalar.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x24-transposec/gen/x24-transposec-4x2-scalar.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x24-transposec/gen/x24-transposec-4x4-scalar.c.o 
[ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packb/gen/x32-packb-2c1s1r-gemm-scalar-float.c.o [ 19%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/TensorIterator.cpp.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packb/gen/x32-packb-2c1s1r-gemm-scalar-int.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packb/gen/x32-packb-2c2s1r-gemm-scalar-float.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packb/gen/x32-packb-2c2s1r-gemm-scalar-int.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packb/gen/x32-packb-4c1s1r-gemm-scalar-float.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packb/gen/x32-packb-4c1s1r-gemm-scalar-int.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packb/gen/x32-packb-4c4s1r-gemm-scalar-float.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packb/gen/x32-packb-4c4s1r-gemm-scalar-int.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packw/gen/x32-packw-x2-gemm-goi-scalar-float-u4.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packw/gen/x32-packw-x2-gemm-goi-scalar-int-u4.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packw/gen/x32-packw-x3-gemm-goi-scalar-float-u4.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packw/gen/x32-packw-x3-gemm-goi-scalar-int-u4.c.o [ 19%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/TensorMeta.cpp.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packw/gen/x32-packw-x4-gemm-goi-scalar-float-u4.c.o [ 19%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packw/gen/x32-packw-x4-gemm-goi-scalar-int-u4.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packw/gen/x32-packw-x8-gemm-goi-scalar-float-u4.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packw/gen/x32-packw-x8-gemm-goi-scalar-int-u4.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packw/gen/x32-packw-x16-gemm-goi-scalar-float-u4.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packw/gen/x32-packw-x16-gemm-goi-scalar-int-u4.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packx/x32-packx-2x-scalar.c.o [ 19%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packx/x32-packx-3x-scalar.c.o [ 19%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/TensorNames.cpp.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packx/x32-packx-4x-scalar.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-transposec/gen/x32-transposec-1x2-scalar-float.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-transposec/gen/x32-transposec-1x2-scalar-int.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-transposec/gen/x32-transposec-1x4-scalar-float.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-transposec/gen/x32-transposec-1x4-scalar-int.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-transposec/gen/x32-transposec-2x1-scalar-float.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-transposec/gen/x32-transposec-2x1-scalar-int.c.o [ 20%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-transposec/gen/x32-transposec-2x2-scalar-float.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-transposec/gen/x32-transposec-2x2-scalar-int.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-transposec/gen/x32-transposec-2x4-scalar-float.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-transposec/gen/x32-transposec-2x4-scalar-int.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-transposec/gen/x32-transposec-4x1-scalar-float.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-transposec/gen/x32-transposec-4x1-scalar-int.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-transposec/gen/x32-transposec-4x2-scalar-float.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-transposec/gen/x32-transposec-4x2-scalar-int.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-transposec/gen/x32-transposec-4x4-scalar-float.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-transposec/gen/x32-transposec-4x4-scalar-int.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-unpool/x32-unpool-scalar.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-zerob/gen/x32-zerob-2c1s1r-gemm-scalar-float.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-zerob/gen/x32-zerob-2c1s1r-gemm-scalar-int.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-zerob/gen/x32-zerob-2c2s1r-gemm-scalar-float.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-zerob/gen/x32-zerob-2c2s1r-gemm-scalar-int.c.o [ 20%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-zerob/gen/x32-zerob-4c1s1r-gemm-scalar-float.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-zerob/gen/x32-zerob-4c1s1r-gemm-scalar-int.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-zerob/gen/x32-zerob-4c4s1r-gemm-scalar-float.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-zerob/gen/x32-zerob-4c4s1r-gemm-scalar-int.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-zip/x32-zip-x2-scalar.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-zip/x32-zip-x3-scalar.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-zip/x32-zip-x4-scalar.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-zip/x32-zip-xm-scalar.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x64-transposec/gen/x64-transposec-1x2-scalar-float.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x64-transposec/gen/x64-transposec-1x2-scalar-int.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x64-transposec/gen/x64-transposec-2x1-scalar-float.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x64-transposec/gen/x64-transposec-2x1-scalar-int.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x64-transposec/gen/x64-transposec-2x2-scalar-float.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x64-transposec/gen/x64-transposec-2x2-scalar-int.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x64-transposec/gen/x64-transposec-4x1-scalar-float.c.o [ 20%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x64-transposec/gen/x64-transposec-4x1-scalar-int.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x64-transposec/gen/x64-transposec-4x2-scalar-float.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x64-transposec/gen/x64-transposec-4x2-scalar-int.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/xx-copy/xx-copy-scalar-memcpy.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/xx-fill/xx-fill-scalar-u16.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/xx-pad/xx-pad-p4-scalar-u16.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/xx-transposev/xx-transposev-1x1-scalar-memcpy.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma-expm1minus-rr1-lut8-p4h3ts-div-u1.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma-expm1minus-rr1-lut8-p4h3ts-div-u2.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma-expm1minus-rr1-lut8-p4h3ts-div-u4.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma-expm1minus-rr1-p6h5ts-div-u1.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma-expm1minus-rr1-p6h5ts-div-u2.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma-expm1minus-rr1-p6h5ts-div-u4.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr1-lut4-p4h2ts-div.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr1-lut4-p4h2ts-rcp.c.o [ 20%] Building C 
object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr1-lut4-p4h3ps-div.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr1-lut4-p4h3ps-rcp.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr1-lut4-p4h3ts-div.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr1-lut4-p4h3ts-rcp.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr1-lut8-p3h1ts-div.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr1-lut8-p4h2ts-div.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr1-lut8-p4h2ts-rcp.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr1-lut8-p4h3ps-div.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr1-lut8-p4h3ps-rcp.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr1-lut8-p4h3ts-div.c.o [ 20%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/TensorUtils.cpp.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr1-lut8-p4h3ts-rcp.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr1-lut16-p3h1ts-div.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr1-lut16-p4h2ts-div.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr1-lut16-p4h2ts-rcp.c.o [ 
20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr1-lut16-p4h3ps-div.c.o [ 20%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/ThreadLocalPythonObjects.cpp.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr1-lut16-p4h3ts-div.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr1-lut32-p3h1ts-div.c.o [ 20%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr1-lut64-p3h1ts-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr1-p6h4ts-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr1-p6h5ps-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr1-p6h5ps-rcp.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr1-p6h5ts-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr1-p6h5ts-rcp.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr2-lut4-p4h2ts-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr2-lut4-p4h3ps-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr2-lut4-p4h3ts-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr2-lut8-p3h1ts-div.c.o [ 21%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr2-lut8-p4h2ts-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr2-lut8-p4h2ts-rcp.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr2-lut8-p4h3ps-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr2-lut8-p4h3ts-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr2-lut16-p3h1ts-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr2-lut16-p4h2ts-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr2-lut16-p4h3ps-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr2-lut16-p4h3ts-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr2-lut32-p3h1ts-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr2-lut64-p3h1ts-div.c.o [ 21%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/ThreadLocalState.cpp.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr2-p6h4ts-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr2-p6h5ps-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1minus-rr2-p6h5ts-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1plus-rr1-lut4-p4h2ts-div.c.o [ 21%] Building 
CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/Utils.cpp.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1plus-rr1-lut4-p4h3ps-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1plus-rr1-lut4-p4h3ts-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1plus-rr1-lut8-p3h1ts-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1plus-rr1-lut8-p4h2ts-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1plus-rr1-lut8-p4h3ps-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1plus-rr1-lut8-p4h3ts-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1plus-rr1-lut16-p3h1ts-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1plus-rr1-lut16-p4h2ts-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1plus-rr1-lut16-p4h3ps-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1plus-rr1-lut16-p4h3ts-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1plus-rr1-lut32-p3h1ts-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1plus-rr1-lut64-p3h1ts-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1plus-rr1-p6h4ts-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1plus-rr1-p6h5ps-div.c.o [ 21%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1plus-rr1-p6h5ts-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1plus-rr2-lut4-p4h2ts-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1plus-rr2-lut4-p4h3ps-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1plus-rr2-lut4-p4h3ts-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1plus-rr2-lut8-p3h1ts-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1plus-rr2-lut8-p4h2ts-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1plus-rr2-lut8-p4h3ps-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1plus-rr2-lut8-p4h3ts-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1plus-rr2-lut16-p3h1ts-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1plus-rr2-lut16-p4h2ts-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1plus-rr2-lut16-p4h3ps-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1plus-rr2-lut16-p4h3ts-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1plus-rr2-lut32-p3h1ts-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1plus-rr2-lut64-p3h1ts-div.c.o [ 21%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1plus-rr2-p6h4ts-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1plus-rr2-p6h5ps-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma-expm1plus-rr2-p6h5ts-div.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-avgpool/f32-avgpool-9p8x-minmax-sse-c4.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-avgpool/f32-avgpool-9x-minmax-sse-c4.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-conv-hwc2chw/f32-conv-hwc2chw-3x3s2p1c3x4-sse-1x1.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-conv-hwc2chw/f32-conv-hwc2chw-3x3s2p1c3x4-sse-2x2.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3p1-minmax-sse-1x4-acc2.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3p1-minmax-sse-1x4-acc3.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3p1-minmax-sse-1x4-acc4.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3p1-minmax-sse-1x4.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3p1-minmax-sse-2x4-acc2.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3p1-minmax-sse-2x4.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3p1-minmax-sse-3x4.c.o [ 21%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3p1-minmax-sse-4x4.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3p1-minmax-sse-5x4.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3p1-minmax-sse-6x4.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3s2p1-minmax-sse-1x4-acc2.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3s2p1-minmax-sse-1x4-acc3.c.o [ 21%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3s2p1-minmax-sse-1x4-acc4.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3s2p1-minmax-sse-1x4.c.o [ 22%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/Version.cpp.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3s2p1-minmax-sse-2x4-acc2.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3s2p1-minmax-sse-2x4.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3s2p1-minmax-sse-3x4.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3s2p1-minmax-sse-4x4.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5p2-minmax-sse-1x4-acc2.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5p2-minmax-sse-1x4-acc3.c.o [ 22%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5p2-minmax-sse-1x4-acc4.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5p2-minmax-sse-1x4-acc5.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5p2-minmax-sse-1x4.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5p2-minmax-sse-2x4-acc2.c.o [ 22%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/VmapModeRegistrations.cpp.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5p2-minmax-sse-2x4-acc3.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5p2-minmax-sse-2x4.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5p2-minmax-sse-3x4-acc2.c.o [ 22%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/ZeroTensorFallback.cpp.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5p2-minmax-sse-3x4.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5p2-minmax-sse-4x4-acc2.c.o [ 22%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/autocast_mode.cpp.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5p2-minmax-sse-4x4.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5p2-minmax-sse-5x4.c.o [ 22%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5s2p2-minmax-sse-1x4-acc2.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5s2p2-minmax-sse-1x4-acc3.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5s2p2-minmax-sse-1x4-acc4.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5s2p2-minmax-sse-1x4-acc5.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5s2p2-minmax-sse-1x4.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5s2p2-minmax-sse-2x4-acc2.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5s2p2-minmax-sse-2x4-acc3.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5s2p2-minmax-sse-2x4.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5s2p2-minmax-sse-3x4-acc2.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-5x5s2p2-minmax-sse-3x4.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-3p4c-minmax-sse-acc2.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-3p4c-minmax-sse.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-3p8c-minmax-sse-acc2.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-3p8c-minmax-sse.c.o [ 22%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-4p4c-minmax-sse-acc2.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-4p4c-minmax-sse.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-4p8c-minmax-sse-acc2.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-4p8c-minmax-sse.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-5f5m5l4c4s4r-minmax-sse-acc2.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-5f5m5l4c4s4r-minmax-sse.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-5f5m5l8c4s4r-minmax-sse-acc2.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-5f5m5l8c4s4r-minmax-sse.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-5f5m5l16c4s4r-minmax-sse-acc2.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-5f5m5l16c4s4r-minmax-sse.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-6f6m7l4c4s4r-minmax-sse-acc2.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-6f6m7l4c4s4r-minmax-sse.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-6f6m7l8c4s4r-minmax-sse-acc2.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-6f6m7l8c4s4r-minmax-sse.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-6f6m7l16c4s4r-minmax-sse-acc2.c.o [ 
22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-6f6m7l16c4s4r-minmax-sse.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-8f8m9l4c4s4r-minmax-sse-acc2.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-8f8m9l4c4s4r-minmax-sse.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-8f8m9l8c4s4r-minmax-sse-acc2.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-8f8m9l8c4s4r-minmax-sse.c.o [ 22%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/cpu/FlushDenormal.cpp.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-8f8m9l16c4s4r-minmax-sse-acc2.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-8f8m9l16c4s4r-minmax-sse.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-9p4c-minmax-sse-acc2.c.o [ 22%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/cpu/Utils.cpp.o [ 22%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/detail/CPUGuardImpl.cpp.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-9p4c-minmax-sse.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-9p8c-minmax-sse-acc2.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-9p8c-minmax-sse.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-25p4c-minmax-sse-acc2.c.o [ 22%] Building CXX object 
caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/detail/CUDAHooksInterface.cpp.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-25p4c-minmax-sse.c.o [ 22%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/detail/HIPHooksInterface.cpp.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-25p8c-minmax-sse-acc2.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-25p8c-minmax-sse.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gavgpool-cw/f32-gavgpool-cw-sse-u4.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gavgpool/f32-gavgpool-7p7x-minmax-sse-c4.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gavgpool/f32-gavgpool-7x-minmax-sse-c4.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-1x8-minmax-sse-dup.c.o [ 22%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/detail/IPUHooksInterface.cpp.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-1x8-minmax-sse-load1.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-1x8s4-minmax-sse.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-3x8-minmax-sse-dup.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-3x8-minmax-sse-load1.c.o [ 22%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-3x8s4-minmax-sse.c.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-4x2c4-minmax-sse.c.o [ 23%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-4x8-minmax-sse-dup.c.o [ 23%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/detail/MPSHooksInterface.cpp.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-4x8-minmax-sse-load1.c.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-4x8s4-minmax-sse.c.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-5x8-minmax-sse-dup.c.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-5x8-minmax-sse-load1.c.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-5x8s4-minmax-sse.c.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-6x2c4-minmax-sse.c.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-6x8-minmax-sse-dup.c.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-6x8-minmax-sse-load1.c.o [ 23%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/detail/MTIAHooksInterface.cpp.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-6x8s4-minmax-sse.c.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-1x8-minmax-sse-dup.c.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-1x8-minmax-sse-load1.c.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-1x8s4-minmax-sse.c.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-3x8-minmax-sse-dup.c.o [ 23%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-3x8-minmax-sse-load1.c.o [ 23%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/detail/MetaGuardImpl.cpp.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-3x8s4-minmax-sse.c.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-4x8-minmax-sse-dup.c.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-4x8-minmax-sse-load1.c.o [ 23%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/detail/ORTHooksInterface.cpp.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-4x8s4-minmax-sse.c.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-5x8-minmax-sse-dup.c.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-5x8-minmax-sse-load1.c.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-5x8s4-minmax-sse.c.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-6x8-minmax-sse-dup.c.o [ 23%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/detail/PrivateUse1HooksInterface.cpp.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-6x8-minmax-sse-load1.c.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-6x8s4-minmax-sse.c.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-ibilinear-chw/gen/f32-ibilinear-chw-sse-p4.c.o [ 23%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-ibilinear-chw/gen/f32-ibilinear-chw-sse-p8.c.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-ibilinear/gen/f32-ibilinear-sse-c4.c.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-ibilinear/gen/f32-ibilinear-sse-c8.c.o [ 23%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/detail/XPUHooksInterface.cpp.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-1x8-minmax-sse-dup.c.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-1x8-minmax-sse-load1.c.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-1x8s4-minmax-sse.c.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-3x8-minmax-sse-dup.c.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-3x8-minmax-sse-load1.c.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-3x8s4-minmax-sse.c.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-4x2c4-minmax-sse.c.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-4x8-minmax-sse-dup.c.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-4x8-minmax-sse-load1.c.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-4x8s4-minmax-sse.c.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-5x8-minmax-sse-dup.c.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-5x8-minmax-sse-load1.c.o [ 23%] Building 
C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-5x8s4-minmax-sse.c.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-6x2c4-minmax-sse.c.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-6x8-minmax-sse-dup.c.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-6x8-minmax-sse-load1.c.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-6x8s4-minmax-sse.c.o [ 23%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-maxpool/f32-maxpool-9p8x-minmax-sse-c4.c.o [ 24%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/functorch/ADInterpreters.cpp.o [ 24%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-pavgpool/f32-pavgpool-9p8x-minmax-sse-c4.c.o [ 24%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-pavgpool/f32-pavgpool-9x-minmax-sse-c4.c.o [ 24%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-ppmm/gen/f32-ppmm-4x8-minmax-sse.c.o [ 24%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-prelu/gen/f32-prelu-sse-2x4.c.o [ 24%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-prelu/gen/f32-prelu-sse-2x8.c.o [ 24%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmax-sse-u4.c.o [ 24%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmax-sse-u8-acc2.c.o [ 24%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmax-sse-u12-acc3.c.o [ 24%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmax-sse-u16-acc2.c.o [ 24%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmax-sse-u16-acc4.c.o [ 24%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmin-sse-u4.c.o [ 24%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmin-sse-u8-acc2.c.o [ 24%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmin-sse-u12-acc3.c.o [ 24%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmin-sse-u16-acc2.c.o [ 24%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmin-sse-u16-acc4.c.o [ 24%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rminmax-sse-u4.c.o [ 24%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rminmax-sse-u8-acc2.c.o [ 24%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rminmax-sse-u12-acc3.c.o [ 24%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rminmax-sse-u16-acc2.c.o [ 24%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rminmax-sse-u16-acc4.c.o [ 24%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rsum/gen/f32-rsum-sse-u4.c.o [ 24%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rsum/gen/f32-rsum-sse-u8-acc2.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rsum/gen/f32-rsum-sse-u12-acc3.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rsum/gen/f32-rsum-sse-u16-acc2.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rsum/gen/f32-rsum-sse-u16-acc4.c.o [ 25%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-spmm/gen/f32-spmm-4x1-minmax-sse.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-spmm/gen/f32-spmm-8x1-minmax-sse.c.o [ 25%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/functorch/BatchRulesActivation.cpp.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-spmm/gen/f32-spmm-16x1-minmax-sse.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-spmm/gen/f32-spmm-32x1-minmax-sse.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vadd-minmax-sse-u4.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vadd-minmax-sse-u8.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vaddc-minmax-sse-u4.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vaddc-minmax-sse-u8.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vdiv-minmax-sse-u4.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vdiv-minmax-sse-u8.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vdivc-minmax-sse-u4.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vdivc-minmax-sse-u8.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmax-sse-u4.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmax-sse-u8.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmaxc-sse-u4.c.o [ 25%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmaxc-sse-u8.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmin-sse-u4.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmin-sse-u8.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vminc-sse-u4.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vminc-sse-u8.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmul-minmax-sse-u4.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmul-minmax-sse-u8.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmulc-minmax-sse-u4.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmulc-minmax-sse-u8.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vrdivc-minmax-sse-u4.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vrdivc-minmax-sse-u8.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vrsubc-minmax-sse-u4.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vrsubc-minmax-sse-u8.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsqrdiff-sse-u4.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsqrdiff-sse-u8.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsqrdiffc-sse-u4.c.o [ 25%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsqrdiffc-sse-u8.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsub-minmax-sse-u4.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsub-minmax-sse-u8.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsubc-minmax-sse-u4.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsubc-minmax-sse-u8.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vclamp/gen/f32-vclamp-sse-u4.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vclamp/gen/f32-vclamp-sse-u8.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vcmul/gen/f32-vcmul-sse-u4.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vcmul/gen/f32-vcmul-sse-u12.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vcmul/gen/f32-vcmul-sse-u8.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vcmul/gen/f32-vcmul-sse-u16.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vhswish/gen/f32-vhswish-sse-u4.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vhswish/gen/f32-vhswish-sse-u8.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vlrelu/gen/f32-vlrelu-sse-u4.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vmulcaddc/gen/f32-vmulcaddc-c4-minmax-sse-2x.c.o [ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vlrelu/gen/f32-vlrelu-sse-u8.c.o [ 25%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vmulcaddc/gen/f32-vmulcaddc-c8-minmax-sse-2x.c.o
[ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrelu/gen/f32-vrelu-sse-u4.c.o
[ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrelu/gen/f32-vrelu-sse-u8.c.o
[ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrsqrt/gen/f32-vrsqrt-sse-rsqrt-u4.c.o
[ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrsqrt/gen/f32-vrsqrt-sse-rsqrt-u8.c.o
[ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrsqrt/gen/f32-vrsqrt-sse-rsqrt-u16.c.o
[ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsqrt/gen/f32-vsqrt-sse-sqrt-u4.c.o
[ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsqrt/gen/f32-vsqrt-sse-sqrt-u8.c.o
[ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsqrt/gen/f32-vsqrt-sse-sqrt-u16.c.o
[ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vunary/gen/f32-vabs-sse-u4.c.o
[ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vunary/gen/f32-vabs-sse-u8.c.o
[ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vunary/gen/f32-vneg-sse-u4.c.o
[ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vunary/gen/f32-vneg-sse-u8.c.o
[ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vunary/gen/f32-vsqr-sse-u4.c.o
[ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vunary/gen/f32-vsqr-sse-u8.c.o
[ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-roundd-sse-addsub.c.o
[ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-roundne-sse-addsub.c.o
[ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-roundu-sse-addsub.c.o
[ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-roundz-sse-addsub.c.o
[ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sqrt-sse-hh1mac.c.o
[ 25%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sqrt-sse-nr1mac.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sqrt-sse-nr2mac.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packx/x32-packx-4x-sse.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-transposec/x32-transposec-4x4-sse.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32-vcvt/gen/f16-f32-vcvt-sse2-int16-u8.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32-vcvt/gen/f16-f32-vcvt-sse2-int16-u16.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32-vcvt/gen/f16-f32-vcvt-sse2-int16-u24.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32-vcvt/gen/f16-f32-vcvt-sse2-int16-u32.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32-vcvt/gen/f16-f32-vcvt-sse2-int32-u8.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32-vcvt/gen/f16-f32-vcvt-sse2-int32-u16.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32-vcvt/gen/f16-f32-vcvt-sse2-int32-u24.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32-vcvt/gen/f16-f32-vcvt-sse2-int32-u32.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vunary/gen/f16-vabs-sse2-u8.c.o
[ 26%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/functorch/BatchRulesBinaryOps.cpp.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vunary/gen/f16-vabs-sse2-u16.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vunary/gen/f16-vneg-sse2-u8.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vunary/gen/f16-vneg-sse2-u16.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-argmaxpool/f32-argmaxpool-4x-sse2-c4.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-argmaxpool/f32-argmaxpool-9p8x-sse2-c4.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-argmaxpool/f32-argmaxpool-9x-sse2-c4.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-f16-vcvt/gen/f32-f16-vcvt-sse2-u8.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-f16-vcvt/gen/f32-f16-vcvt-sse2-u16.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-f16-vcvt/gen/f32-f16-vcvt-sse2-u24.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-f16-vcvt/gen/f32-f16-vcvt-sse2-u32.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-1x8-minmax-sse2-dup.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-3x8-minmax-sse2-dup.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-4x8-minmax-sse2-dup.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-5x8-minmax-sse2-dup.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-6x8-minmax-sse2-dup.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-1x8-minmax-sse2-dup.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-3x8-minmax-sse2-dup.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-4x8-minmax-sse2-dup.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-5x8-minmax-sse2-dup.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-6x8-minmax-sse2-dup.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-1x8-minmax-sse2-dup.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-3x8-minmax-sse2-dup.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-4x8-minmax-sse2-dup.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-5x8-minmax-sse2-dup.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-6x8-minmax-sse2-dup.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-prelu/gen/f32-prelu-sse2-2x4.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-prelu/gen/f32-prelu-sse2-2x8.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc4w-gemm/gen/f32-qc4w-gemm-1x8-minmax-sse2-dup.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc4w-gemm/gen/f32-qc4w-gemm-3x8-minmax-sse2-dup.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc4w-gemm/gen/f32-qc4w-gemm-4x8-minmax-sse2-dup.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc4w-gemm/gen/f32-qc4w-gemm-5x8-minmax-sse2-dup.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc4w-gemm/gen/f32-qc4w-gemm-6x8-minmax-sse2-dup.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-1x8-minmax-sse2-dup.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-1x8-minmax-sse2-load1.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-1x8s4-minmax-sse2.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-3x8-minmax-sse2-dup.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-3x8-minmax-sse2-load1.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-3x8s4-minmax-sse2.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-4x2c4-minmax-sse2.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-4x8-minmax-sse2-dup.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-4x8-minmax-sse2-load1.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-4x8s4-minmax-sse2.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-5x8-minmax-sse2-dup.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-5x8-minmax-sse2-load1.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-5x8s4-minmax-sse2.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-6x8-minmax-sse2-dup.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-6x8-minmax-sse2-load1.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-6x8s4-minmax-sse2.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qs8-vcvt/gen/f32-qs8-vcvt-sse2-u8.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qs8-vcvt/gen/f32-qs8-vcvt-sse2-u16.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qs8-vcvt/gen/f32-qs8-vcvt-sse2-u24.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qs8-vcvt/gen/f32-qs8-vcvt-sse2-u32.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qu8-vcvt/gen/f32-qu8-vcvt-sse2-u8.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qu8-vcvt/gen/f32-qu8-vcvt-sse2-u16.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qu8-vcvt/gen/f32-qu8-vcvt-sse2-u24.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qu8-vcvt/gen/f32-qu8-vcvt-sse2-u32.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-sse2-rr2-p5-u4.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-sse2-rr2-p5-u8-acc2.c.o
[ 26%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-sse2-rr2-p5-u8.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-sse2-rr2-p5-u12-acc2.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-sse2-rr2-p5-u12-acc3.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-sse2-rr2-p5-u12.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-sse2-rr2-p5-u16-acc2.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-sse2-rr2-p5-u16-acc4.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-sse2-rr2-p5-u16.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-sse2-rr2-p5-u20-acc2.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-sse2-rr2-p5-u20-acc5.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-sse2-rr2-p5-u20.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-sse2-rr2-lut16-p3-u4.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-sse2-rr2-lut16-p3-u8.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-sse2-rr2-lut16-p3-u12.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-sse2-rr2-lut16-p3-u16.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-sse2-rr2-lut16-p3-u20.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-sse2-rr2-lut16-p3-u24.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-sse2-rr2-p6-u4.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-sse2-rr2-p6-u8.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-sse2-rr2-p6-u12.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-sse2-rr2-p6-u16.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-sse2-rr2-p6-u20.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-sse2-rr2-p6-u24.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vlrelu/gen/f32-vlrelu-sse2-u4.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vlrelu/gen/f32-vlrelu-sse2-u8.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndd-sse2-u4.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndd-sse2-u8.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndne-sse2-u4.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndne-sse2-u8.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndu-sse2-u4.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndu-sse2-u8.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndz-sse2-u4.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndz-sse2-u8.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-sse2-rr2-lut64-p2-div-u4.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-sse2-rr2-lut64-p2-div-u8.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-sse2-rr2-lut64-p2-div-u12.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-sse2-rr2-lut64-p2-div-u16.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-sse2-rr2-lut64-p2-div-u20.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-sse2-rr2-lut64-p2-div-u24.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-sse2-rr2-p5-div-u4.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-sse2-rr2-p5-div-u8.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-sse2-rr2-p5-div-u12.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-sse2-rr2-p5-div-u16.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-sse2-rr2-p5-div-u20.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-sse2-rr2-p5-div-u24.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse2-expm1minus-rr1-lut8-p4h3ts-div-u4.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse2-expm1minus-rr1-lut8-p4h3ts-div-u8.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse2-expm1minus-rr1-lut8-p4h3ts-div-u12.c.o
[ 27%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/functorch/BatchRulesConvolution.cpp.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse2-expm1minus-rr1-lut8-p4h3ts-div-u16.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse2-expm1minus-rr1-p6h5ts-div-u4.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse2-expm1minus-rr1-p6h5ts-div-u8.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse2-expm1minus-rr1-p6h5ts-div-u12.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse2-expm1minus-rr1-p6h5ts-div-u16.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse2-expm1minus-rr1-p6h5ts-nr1-u4.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse2-expm1minus-rr1-p6h5ts-nr1-u8.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse2-expm1minus-rr1-p6h5ts-nr1-u12.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse2-expm1minus-rr1-p6h5ts-nr1-u16.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse2-expm1minus-rr1-p6h5ts-nr2-u4.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse2-expm1minus-rr1-p6h5ts-nr2-u8.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse2-expm1minus-rr1-p6h5ts-nr2-u12.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse2-expm1minus-rr1-p6h5ts-nr2-u16.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f16-f32-cvt-sse2-int16.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f16-f32-cvt-sse2-int32.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-exp-sse2-rr2-lut64-p2.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-exp-sse2-rr2-p5.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-expm1minus-sse2-rr2-lut16-p3.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-expm1minus-sse2-rr2-p6.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-expminus-sse2-rr2-p5.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-f16-cvt-sse2.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-roundd-sse2-cvt.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-roundne-sse2-cvt.c.o
[ 27%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-roundu-sse2-cvt.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-roundz-sse2-cvt.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-sse2-rr2-lut64-p2-div.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-sse2-rr2-lut64-p2-nr1.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-sse2-rr2-lut64-p2-nr2.c.o
[ 28%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/functorch/BatchRulesDecompositions.cpp.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-sse2-rr2-p5-div.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-sse2-rr2-p5-nr1.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-sse2-rr2-p5-nr2.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-sse2-expm1minus-rr1-lut8-p4h3ps-div.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-sse2-expm1minus-rr1-p6h5ts-div.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-sse2-expm1minus-rr1-p6h5ts-nr1.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-sse2-expm1minus-rr1-p6h5ts-nr2.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-sse2-expm1minus-rr2-lut8-p4h2ts-nr1.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-sse2-expm1minus-rr2-lut8-p4h2ts-nr2.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-sse2-expm1minus-rr2-lut8-p4h3ps-nr1.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-sse2-expm1minus-rr2-lut8-p4h3ps-nr2.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-sse2-expm1minus-rr2-lut8-p4h3ts-nr1.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-sse2-expm1minus-rr2-lut8-p4h3ts-nr2.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-1x4c8-minmax-sse2-ld64.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-1x4c8-minmax-sse2-ld128.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-2x4c8-minmax-sse2-ld64.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-2x4c8-minmax-sse2-ld128.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-3x4c8-minmax-sse2-ld64.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-3x4c8-minmax-sse2-ld128.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-4x4c8-minmax-sse2-ld64.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-4x4c8-minmax-sse2-ld128.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-1x4c8-minmax-sse2-ld64.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-1x4c8-minmax-sse2-ld128.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-2x4c8-minmax-sse2-ld64.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-2x4c8-minmax-sse2-ld128.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-3x4c8-minmax-sse2-ld64.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-3x4c8-minmax-sse2-ld128.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-4x4c8-minmax-sse2-ld64.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-4x4c8-minmax-sse2-ld128.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-1x4c8-minmax-sse2-ld64.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-1x4c8-minmax-sse2-ld128.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-2x4c8-minmax-sse2-ld64.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-2x4c8-minmax-sse2-ld128.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-3x4c8-minmax-sse2-ld64.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-3x4c8-minmax-sse2-ld128.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-4x4c8-minmax-sse2-ld64.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-4x4c8-minmax-sse2-ld128.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-5f5m5l8c8s8r-minmax-fp32-sse2-mul16-add16.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-5f5m5l8c8s8r-minmax-fp32-sse2-mul16.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-5f5m5l16c8s8r-minmax-fp32-sse2-mul16-add16.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-5f5m5l16c8s8r-minmax-fp32-sse2-mul16.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-6f6m7l8c8s8r-minmax-fp32-sse2-mul16-add16.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-6f6m7l8c8s8r-minmax-fp32-sse2-mul16.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-6f6m7l16c8s8r-minmax-fp32-sse2-mul16-add16.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-6f6m7l16c8s8r-minmax-fp32-sse2-mul16.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-8f8m9l8c8s8r-minmax-fp32-sse2-mul16-add16.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-8f8m9l8c8s8r-minmax-fp32-sse2-mul16.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-8f8m9l16c8s8r-minmax-fp32-sse2-mul16-add16.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-8f8m9l16c8s8r-minmax-fp32-sse2-mul16.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p8c-minmax-fp32-sse2-mul16-add16.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p8c-minmax-fp32-sse2-mul16.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p16c-minmax-fp32-sse2-mul16-add16.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p16c-minmax-fp32-sse2-mul16.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p8c-minmax-fp32-sse2-mul16-add16.c.o
[ 28%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/functorch/BatchRulesDynamic.cpp.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p8c-minmax-fp32-sse2-mul16.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p16c-minmax-fp32-sse2-mul16-add16.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p16c-minmax-fp32-sse2-mul16.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-f32-vcvt/gen/qs8-f32-vcvt-sse2-u8.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-f32-vcvt/gen/qs8-f32-vcvt-sse2-u16.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-f32-vcvt/gen/qs8-f32-vcvt-sse2-u24.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-f32-vcvt/gen/qs8-f32-vcvt-sse2-u32.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-gavgpool/gen/qs8-gavgpool-7p7x-minmax-fp32-sse2-c8.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-gavgpool/gen/qs8-gavgpool-7p7x-minmax-fp32-sse2-c16.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-gavgpool/gen/qs8-gavgpool-7p7x-minmax-fp32-sse2-c24.c.o
[ 28%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/functorch/BatchRulesFactory.cpp.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-gavgpool/gen/qs8-gavgpool-7x-minmax-fp32-sse2-c8.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-gavgpool/gen/qs8-gavgpool-7x-minmax-fp32-sse2-c16.c.o
[ 28%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-gavgpool/gen/qs8-gavgpool-7x-minmax-fp32-sse2-c24.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-3p8c-minmax-fp32-sse2-mul16.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-5f5m5l8c8s8r-minmax-fp32-sse2-mul16-add16.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-5f5m5l8c8s8r-minmax-fp32-sse2-mul16.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-5f5m5l16c8s8r-minmax-fp32-sse2-mul16-add16.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-5f5m5l16c8s8r-minmax-fp32-sse2-mul16.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-6f6m7l8c8s8r-minmax-fp32-sse2-mul16-add16.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-6f6m7l8c8s8r-minmax-fp32-sse2-mul16.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-6f6m7l16c8s8r-minmax-fp32-sse2-mul16-add16.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-6f6m7l16c8s8r-minmax-fp32-sse2-mul16.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-8f8m9l8c8s8r-minmax-fp32-sse2-mul16-add16.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-8f8m9l8c8s8r-minmax-fp32-sse2-mul16.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-8f8m9l16c8s8r-minmax-fp32-sse2-mul16-add16.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-8f8m9l16c8s8r-minmax-fp32-sse2-mul16.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p8c-minmax-fp32-sse2-mul16-add16.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p8c-minmax-fp32-sse2-mul16.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p16c-minmax-fp32-sse2-mul16-add16.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p16c-minmax-fp32-sse2-mul16.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p8c-minmax-fp32-sse2-mul16-add16.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p8c-minmax-fp32-sse2-mul16.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p16c-minmax-fp32-sse2-mul16-add16.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p16c-minmax-fp32-sse2-mul16.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x4c2-minmax-fp32-sse2-ld64.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x4c2-minmax-fp32-sse2-ld128.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x4c2s4-minmax-fp32-sse2-ld64.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x4c2s4-minmax-fp32-sse2-ld128.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x4c8-minmax-fp32-sse2-ld64.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x4c8-minmax-fp32-sse2-ld128.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x4c2-minmax-fp32-sse2-ld64.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x4c2-minmax-fp32-sse2-ld128.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x4c2s4-minmax-fp32-sse2-ld64.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x4c2s4-minmax-fp32-sse2-ld128.c.o
[ 29%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/functorch/BatchRulesHelper.cpp.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x4c8-minmax-fp32-sse2-ld64.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x4c8-minmax-fp32-sse2-ld128.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x4c2-minmax-fp32-sse2-ld64.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x4c2-minmax-fp32-sse2-ld128.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x4c2s4-minmax-fp32-sse2-ld64.c.o
[ 29%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/functorch/BatchRulesLinearAlgebra.cpp.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x4c2s4-minmax-fp32-sse2-ld128.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x4c8-minmax-fp32-sse2-ld64.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x4c8-minmax-fp32-sse2-ld128.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-4x4c2-minmax-fp32-sse2-ld64.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-4x4c2-minmax-fp32-sse2-ld128.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-4x4c2s4-minmax-fp32-sse2-ld64.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-4x4c2s4-minmax-fp32-sse2-ld128.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x4c2-minmax-fp32-sse2-ld64.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x4c2-minmax-fp32-sse2-ld128.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x4c2s4-minmax-fp32-sse2-ld64.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x4c2s4-minmax-fp32-sse2-ld128.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x4c8-minmax-fp32-sse2-ld64.c.o
[ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x4c8-minmax-fp32-sse2-ld128.c.o
[ 29%] Building C object
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x4c2-minmax-fp32-sse2-ld64.c.o [ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x4c2-minmax-fp32-sse2-ld128.c.o [ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x4c2s4-minmax-fp32-sse2-ld64.c.o [ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x4c2s4-minmax-fp32-sse2-ld128.c.o [ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x4c8-minmax-fp32-sse2-ld64.c.o [ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x4c8-minmax-fp32-sse2-ld128.c.o [ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x4c2-minmax-fp32-sse2-ld64.c.o [ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x4c2-minmax-fp32-sse2-ld128.c.o [ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x4c2s4-minmax-fp32-sse2-ld64.c.o [ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x4c2s4-minmax-fp32-sse2-ld128.c.o [ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x4c8-minmax-fp32-sse2-ld64.c.o [ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x4c8-minmax-fp32-sse2-ld128.c.o [ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-4x4c2-minmax-fp32-sse2-ld64.c.o [ 29%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-4x4c2-minmax-fp32-sse2-ld128.c.o [ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-4x4c2s4-minmax-fp32-sse2-ld64.c.o [ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-4x4c2s4-minmax-fp32-sse2-ld128.c.o [ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-requantization/qs8-requantization-fp32-sse2.c.o [ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-requantization/qs8-requantization-gemmlowp-sse2.c.o [ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-requantization/qs8-requantization-rndna-sse2.c.o [ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vadd/gen/qs8-vadd-minmax-sse2-mul16-ld64-u8.c.o [ 29%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/functorch/BatchRulesLoss.cpp.o [ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vadd/gen/qs8-vadd-minmax-sse2-mul16-ld64-u16.c.o [ 29%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vadd/gen/qs8-vadd-minmax-sse2-mul16-ld64-u24.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vadd/gen/qs8-vadd-minmax-sse2-mul16-ld64-u32.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vaddc/gen/qs8-vaddc-minmax-sse2-mul16-ld64-u8.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vaddc/gen/qs8-vaddc-minmax-sse2-mul16-ld64-u16.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vaddc/gen/qs8-vaddc-minmax-sse2-mul16-ld64-u24.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vaddc/gen/qs8-vaddc-minmax-sse2-mul16-ld64-u32.c.o [ 
30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vcvt/gen/qs8-vcvt-sse2-u16.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vcvt/gen/qs8-vcvt-sse2-u32.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vhswish/gen/qs8-vhswish-sse2-u16.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vhswish/gen/qs8-vhswish-sse2-u32.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vlrelu/gen/qs8-vlrelu-sse2-u16.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vlrelu/gen/qs8-vlrelu-sse2-u32.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vmul/gen/qs8-vmul-minmax-fp32-sse2-mul16-ld64-u8.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vmul/gen/qs8-vmul-minmax-fp32-sse2-mul16-ld64-u16.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vmulc/gen/qs8-vmulc-minmax-fp32-sse2-mul16-ld64-u8.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vmulc/gen/qs8-vmulc-minmax-fp32-sse2-mul16-ld64-u16.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs16-qs8-vcvt/gen/qs16-qs8-vcvt-sse2-u4.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs16-qs8-vcvt/gen/qs16-qs8-vcvt-sse2-u8.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs16-qs8-vcvt/gen/qs16-qs8-vcvt-sse2-u16.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-avgpool/qu8-avgpool-9p8x-minmax-fp32-sse2-c8.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-avgpool/qu8-avgpool-9x-minmax-fp32-sse2-c8.c.o [ 30%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-5f5m5l8c8s8r-minmax-fp32-sse2-mul16.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-5f5m5l16c8s8r-minmax-fp32-sse2-mul16.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-6f6m7l8c8s8r-minmax-fp32-sse2-mul16.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-6f6m7l16c8s8r-minmax-fp32-sse2-mul16.c.o [ 30%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/functorch/BatchRulesModules.cpp.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-8f8m9l8c8s8r-minmax-fp32-sse2-mul16.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-8f8m9l16c8s8r-minmax-fp32-sse2-mul16.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-9p8c-minmax-fp32-sse2-mul16.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-9p16c-minmax-fp32-sse2-mul16.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-25p8c-minmax-fp32-sse2-mul16.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-25p16c-minmax-fp32-sse2-mul16.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-f32-vcvt/gen/qu8-f32-vcvt-sse2-u8.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-f32-vcvt/gen/qu8-f32-vcvt-sse2-u16.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-f32-vcvt/gen/qu8-f32-vcvt-sse2-u24.c.o [ 30%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-f32-vcvt/gen/qu8-f32-vcvt-sse2-u32.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gavgpool/gen/qu8-gavgpool-7p7x-minmax-fp32-sse2-c8.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gavgpool/gen/qu8-gavgpool-7p7x-minmax-fp32-sse2-c16.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gavgpool/gen/qu8-gavgpool-7p7x-minmax-fp32-sse2-c24.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gavgpool/gen/qu8-gavgpool-7x-minmax-fp32-sse2-c8.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gavgpool/gen/qu8-gavgpool-7x-minmax-fp32-sse2-c16.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gavgpool/gen/qu8-gavgpool-7x-minmax-fp32-sse2-c24.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-1x4c2-minmax-fp32-sse2-ld64.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-1x4c2-minmax-fp32-sse2-ld128.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-1x4c2s4-minmax-fp32-sse2-ld64.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-1x4c2s4-minmax-fp32-sse2-ld128.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-1x4c8-minmax-fp32-sse2-ld64.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-1x4c8-minmax-fp32-sse2-ld128.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-2x4c2-minmax-fp32-sse2-ld64.c.o [ 30%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-2x4c2-minmax-fp32-sse2-ld128.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-2x4c2s4-minmax-fp32-sse2-ld64.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-2x4c2s4-minmax-fp32-sse2-ld128.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-2x4c8-minmax-fp32-sse2-ld64.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-2x4c8-minmax-fp32-sse2-ld128.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-3x4c2-minmax-fp32-sse2-ld64.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-3x4c2-minmax-fp32-sse2-ld128.c.o [ 30%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/functorch/BatchRulesNorm.cpp.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-3x4c2s4-minmax-fp32-sse2-ld64.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-3x4c2s4-minmax-fp32-sse2-ld128.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-3x4c8-minmax-fp32-sse2-ld64.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-3x4c8-minmax-fp32-sse2-ld128.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-4x4c2-minmax-fp32-sse2-ld64.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-4x4c2-minmax-fp32-sse2-ld128.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-4x4c2s4-minmax-fp32-sse2-ld64.c.o [ 30%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-4x4c2s4-minmax-fp32-sse2-ld128.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-1x4c2-minmax-fp32-sse2-ld64.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-1x4c2-minmax-fp32-sse2-ld128.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-1x4c2s4-minmax-fp32-sse2-ld64.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-1x4c2s4-minmax-fp32-sse2-ld128.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-1x4c8-minmax-fp32-sse2-ld64.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-1x4c8-minmax-fp32-sse2-ld128.c.o [ 30%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/functorch/BatchRulesPooling.cpp.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-2x4c2-minmax-fp32-sse2-ld64.c.o [ 30%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-2x4c2-minmax-fp32-sse2-ld128.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-2x4c2s4-minmax-fp32-sse2-ld64.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-2x4c2s4-minmax-fp32-sse2-ld128.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-2x4c8-minmax-fp32-sse2-ld64.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-2x4c8-minmax-fp32-sse2-ld128.c.o [ 31%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-3x4c2-minmax-fp32-sse2-ld64.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-3x4c2-minmax-fp32-sse2-ld128.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-3x4c2s4-minmax-fp32-sse2-ld64.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-3x4c2s4-minmax-fp32-sse2-ld128.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-3x4c8-minmax-fp32-sse2-ld64.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-3x4c8-minmax-fp32-sse2-ld128.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-4x4c2-minmax-fp32-sse2-ld64.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-4x4c2-minmax-fp32-sse2-ld128.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-4x4c2s4-minmax-fp32-sse2-ld64.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-4x4c2s4-minmax-fp32-sse2-ld128.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-requantization/qu8-requantization-fp32-sse2.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-requantization/qu8-requantization-gemmlowp-sse2.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-requantization/qu8-requantization-rndna-sse2.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vadd/gen/qu8-vadd-minmax-sse2-mul16-ld64-u8.c.o [ 31%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vadd/gen/qu8-vadd-minmax-sse2-mul16-ld64-u16.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vaddc/gen/qu8-vaddc-minmax-sse2-mul16-ld64-u8.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vaddc/gen/qu8-vaddc-minmax-sse2-mul16-ld64-u16.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vcvt/gen/qu8-vcvt-sse2-u16.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vcvt/gen/qu8-vcvt-sse2-u32.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vhswish/gen/qu8-vhswish-sse2-u16.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vhswish/gen/qu8-vhswish-sse2-u32.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vlrelu/gen/qu8-vlrelu-sse2-u16.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vlrelu/gen/qu8-vlrelu-sse2-u32.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vmul/gen/qu8-vmul-minmax-fp32-sse2-mul16-ld64-u8.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vmul/gen/qu8-vmul-minmax-fp32-sse2-mul16-ld64-u16.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vmulc/gen/qu8-vmulc-minmax-fp32-sse2-mul16-ld64-u8.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vmulc/gen/qu8-vmulc-minmax-fp32-sse2-mul16-ld64-u16.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/s8-ibilinear/gen/s8-ibilinear-sse2-c8.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/s8-ibilinear/gen/s8-ibilinear-sse2-c16.c.o [ 31%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/s8-maxpool/s8-maxpool-9p8x-minmax-sse2-c16.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/s8-vclamp/s8-vclamp-sse2-u64.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/u8-ibilinear/gen/u8-ibilinear-sse2-c8.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/u8-ibilinear/gen/u8-ibilinear-sse2-c16.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/u8-maxpool/u8-maxpool-9p8x-minmax-sse2-c16.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/u8-rmax/u8-rmax-sse2-u16.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/u8-vclamp/u8-vclamp-sse2-u64.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-transposec/gen/x8-transposec-16x16-reuse-mov-sse2.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-transposec/gen/x8-transposec-16x16-reuse-switch-sse2.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-zip/x8-zip-x2-sse2.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-zip/x8-zip-x3-sse2.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-zip/x8-zip-x4-sse2.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-zip/x8-zip-xm-sse2.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x16-transposec/gen/x16-transposec-8x8-multi-mov-sse2.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x16-transposec/gen/x16-transposec-8x8-multi-switch-sse2.c.o [ 31%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/functorch/BatchRulesRandomness.cpp.o [ 31%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x16-transposec/gen/x16-transposec-8x8-reuse-mov-sse2.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x16-transposec/gen/x16-transposec-8x8-reuse-multi-sse2.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x16-transposec/gen/x16-transposec-8x8-reuse-switch-sse2.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x16-transposec/x16-transposec-4x8-sse2.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packw/gen/x32-packw-x2c4-gemm-goi-sse2-u4-prfm.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packw/gen/x32-packw-x2c4-gemm-goi-sse2-u4.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packw/gen/x32-packw-x8-gemm-goi-sse2-u4-prfm.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packw/gen/x32-packw-x8-gemm-goi-sse2-u4.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packw/gen/x32-packw-x8-gemm-goi-sse2-u8-prfm.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packw/gen/x32-packw-x8-gemm-goi-sse2-u8.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packw/gen/x32-packw-x8s4-gemm-goi-sse2-u4-prfm.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packw/gen/x32-packw-x8s4-gemm-goi-sse2-u4.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packw/gen/x32-packw-x8s4-gemm-goi-sse2-u8-prfm.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packw/gen/x32-packw-x8s4-gemm-goi-sse2-u8.c.o [ 31%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/functorch/BatchRulesReduceOps.cpp.o [ 31%] Building CXX object 
caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/functorch/BatchRulesScatterOps.cpp.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packw/gen/x32-packw-x16-gemm-goi-sse2-u4-prfm.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packw/gen/x32-packw-x16-gemm-goi-sse2-u4.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packw/gen/x32-packw-x16-gemm-goi-sse2-u8-prfm.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packw/gen/x32-packw-x16-gemm-goi-sse2-u8.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packw/gen/x32-packw-x16s4-gemm-goi-sse2-u4-prfm.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packw/gen/x32-packw-x16s4-gemm-goi-sse2-u4.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packw/gen/x32-packw-x16s4-gemm-goi-sse2-u8-prfm.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packw/gen/x32-packw-x16s4-gemm-goi-sse2-u8.c.o [ 31%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-transposec/gen/x32-transposec-4x4-multi-mov-sse2.c.o [ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-transposec/gen/x32-transposec-4x4-multi-multi-sse2.c.o [ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-transposec/gen/x32-transposec-4x4-multi-switch-sse2.c.o [ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-transposec/gen/x32-transposec-4x4-reuse-mov-sse2.c.o [ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-transposec/gen/x32-transposec-4x4-reuse-multi-sse2.c.o [ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-transposec/gen/x32-transposec-4x4-reuse-switch-sse2.c.o [ 32%] Building C 
object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-unpool/x32-unpool-sse2.c.o [ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-zip/x32-zip-x2-sse2.c.o [ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-zip/x32-zip-x3-sse2.c.o [ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-zip/x32-zip-x4-sse2.c.o [ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-zip/x32-zip-xm-sse2.c.o [ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x64-transposec/gen/x64-transposec-2x2-multi-mov-sse2.c.o [ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x64-transposec/gen/x64-transposec-2x2-multi-multi-sse2.c.o [ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x64-transposec/gen/x64-transposec-2x2-multi-switch-sse2.c.o [ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x64-transposec/gen/x64-transposec-2x2-reuse-mov-sse2.c.o [ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x64-transposec/gen/x64-transposec-2x2-reuse-multi-sse2.c.o [ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x64-transposec/gen/x64-transposec-2x2-reuse-switch-sse2.c.o [ 32%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/functorch/BatchRulesUnaryOps.cpp.o [ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/xx-fill/xx-fill-sse2-u64.c.o [ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/xx-pad/xx-pad-p16-sse2-u16.c.o [ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3p1-minmax-ssse3-1x4-acc2.c.o [ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3p1-minmax-ssse3-1x4-acc3.c.o [ 32%] 
Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3p1-minmax-ssse3-1x4-acc4.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3p1-minmax-ssse3-1x4.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3p1-minmax-ssse3-2x4-acc2.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3p1-minmax-ssse3-2x4.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3p1-minmax-ssse3-3x4.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3p1-minmax-ssse3-4x4.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3p1-minmax-ssse3-5x4.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv2d-chw/gen/f32-dwconv2d-chw-3x3p1-minmax-ssse3-6x4.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-requantization/qs8-requantization-gemmlowp-ssse3.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-requantization/qs8-requantization-rndna-ssse3.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vcvt/gen/qs8-vcvt-ssse3-u16.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vcvt/gen/qs8-vcvt-ssse3-u32.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vhswish/gen/qs8-vhswish-ssse3-u16.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vhswish/gen/qs8-vhswish-ssse3-u32.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vlrelu/gen/qs8-vlrelu-ssse3-u16.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vlrelu/gen/qs8-vlrelu-ssse3-u32.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs16-qs8-vcvt/gen/qs16-qs8-vcvt-ssse3-u4.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs16-qs8-vcvt/gen/qs16-qs8-vcvt-ssse3-u8.c.o
[ 32%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/functorch/BatchRulesViews.cpp.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs16-qs8-vcvt/gen/qs16-qs8-vcvt-ssse3-u16.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-requantization/qu8-requantization-gemmlowp-ssse3.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-requantization/qu8-requantization-rndna-ssse3.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vcvt/gen/qu8-vcvt-ssse3-u16.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vcvt/gen/qu8-vcvt-ssse3-u32.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vhswish/gen/qu8-vhswish-ssse3-u16.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vhswish/gen/qu8-vhswish-ssse3-u32.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vlrelu/gen/qu8-vlrelu-ssse3-u16.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vlrelu/gen/qu8-vlrelu-ssse3-u32.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-lut/gen/x8-lut-ssse3-u16.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-lut/gen/x8-lut-ssse3-u32.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x24-transposec/x24-transposec-4x4-ssse3.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32-vcvt/gen/f16-f32-vcvt-sse41-int16-u8.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32-vcvt/gen/f16-f32-vcvt-sse41-int16-u16.c.o
[ 32%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/functorch/BatchedFallback.cpp.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32-vcvt/gen/f16-f32-vcvt-sse41-int16-u24.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32-vcvt/gen/f16-f32-vcvt-sse41-int16-u32.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32-vcvt/gen/f16-f32-vcvt-sse41-int32-u8.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32-vcvt/gen/f16-f32-vcvt-sse41-int32-u16.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32-vcvt/gen/f16-f32-vcvt-sse41-int32-u24.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32-vcvt/gen/f16-f32-vcvt-sse41-int32-u32.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-f16-vcvt/gen/f32-f16-vcvt-sse41-u8.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-f16-vcvt/gen/f32-f16-vcvt-sse41-u16.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-f16-vcvt/gen/f32-f16-vcvt-sse41-u24.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-f16-vcvt/gen/f32-f16-vcvt-sse41-u32.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-prelu/gen/f32-prelu-sse41-2x4.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-prelu/gen/f32-prelu-sse41-2x8.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc4w-gemm/gen/f32-qc4w-gemm-1x8-minmax-sse41-dup.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc4w-gemm/gen/f32-qc4w-gemm-3x8-minmax-sse41-dup.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc4w-gemm/gen/f32-qc4w-gemm-4x8-minmax-sse41-dup.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc4w-gemm/gen/f32-qc4w-gemm-5x8-minmax-sse41-dup.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc4w-gemm/gen/f32-qc4w-gemm-6x8-minmax-sse41-dup.c.o
[ 32%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-1x8-minmax-sse41-dup.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-1x8-minmax-sse41-load1.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-1x8s4-minmax-sse41.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-3x8-minmax-sse41-dup.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-3x8-minmax-sse41-load1.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-3x8s4-minmax-sse41.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-4x2c4-minmax-sse41.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-4x8-minmax-sse41-dup.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-4x8-minmax-sse41-load1.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-4x8s4-minmax-sse41.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-5x8-minmax-sse41-dup.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-5x8-minmax-sse41-load1.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-5x8s4-minmax-sse41.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-6x2c4-minmax-sse41.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-6x8-minmax-sse41-dup.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-6x8-minmax-sse41-load1.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-6x8s4-minmax-sse41.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qs8-vcvt/gen/f32-qs8-vcvt-sse41-u8.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qs8-vcvt/gen/f32-qs8-vcvt-sse41-u16.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qs8-vcvt/gen/f32-qs8-vcvt-sse41-u24.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qs8-vcvt/gen/f32-qs8-vcvt-sse41-u32.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-sse41-rr2-lut16-p3-u4.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-sse41-rr2-lut16-p3-u8.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-sse41-rr2-lut16-p3-u12.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-sse41-rr2-lut16-p3-u16.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-sse41-rr2-lut16-p3-u20.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-sse41-rr2-lut16-p3-u24.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-sse41-rr2-p6-u4.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-sse41-rr2-p6-u8.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-sse41-rr2-p6-u12.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-sse41-rr2-p6-u16.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-sse41-rr2-p6-u20.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-sse41-rr2-p6-u24.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vlrelu/gen/f32-vlrelu-sse41-u4.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vlrelu/gen/f32-vlrelu-sse41-u8.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndd-sse41-u4.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndd-sse41-u8.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndne-sse41-u4.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndne-sse41-u8.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndu-sse41-u4.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndu-sse41-u8.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndz-sse41-u4.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndz-sse41-u8.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-sse41-rr2-lut64-p2-div-u4.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-sse41-rr2-lut64-p2-div-u8.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-sse41-rr2-lut64-p2-div-u12.c.o
[ 33%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/functorch/BatchedTensorImpl.cpp.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-sse41-rr2-lut64-p2-div-u16.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-sse41-rr2-lut64-p2-div-u20.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-sse41-rr2-lut64-p2-div-u24.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-sse41-rr2-p5-div-u4.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-sse41-rr2-p5-div-u8.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-sse41-rr2-p5-div-u12.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-sse41-rr2-p5-div-u16.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-sse41-rr2-p5-div-u20.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-sse41-rr2-p5-div-u24.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse41-expm1minus-rr1-lut8-p4h3ts-div-u4.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse41-expm1minus-rr1-lut8-p4h3ts-div-u8.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse41-expm1minus-rr1-lut8-p4h3ts-div-u12.c.o
[ 33%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/functorch/DynamicLayer.cpp.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse41-expm1minus-rr1-lut8-p4h3ts-div-u16.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse41-expm1minus-rr1-lut8-p4h3ts-div-u20.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse41-expm1minus-rr1-lut8-p4h3ts-div-u24.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse41-expm1minus-rr1-p6h5ts-div-u4.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse41-expm1minus-rr1-p6h5ts-div-u8.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse41-expm1minus-rr1-p6h5ts-div-u12.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse41-expm1minus-rr1-p6h5ts-div-u16.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse41-expm1minus-rr1-p6h5ts-div-u20.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse41-expm1minus-rr1-p6h5ts-div-u24.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse41-expm1minus-rr1-p6h5ts-nr1-u4.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse41-expm1minus-rr1-p6h5ts-nr1-u8.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse41-expm1minus-rr1-p6h5ts-nr1-u12.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse41-expm1minus-rr1-p6h5ts-nr1-u16.c.o
[ 33%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse41-expm1minus-rr1-p6h5ts-nr1-u20.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse41-expm1minus-rr1-p6h5ts-nr1-u24.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse41-expm1minus-rr1-p6h5ts-nr2-u4.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse41-expm1minus-rr1-p6h5ts-nr2-u8.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse41-expm1minus-rr1-p6h5ts-nr2-u12.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse41-expm1minus-rr1-p6h5ts-nr2-u16.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse41-expm1minus-rr1-p6h5ts-nr2-u20.c.o
[ 34%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/functorch/FunctionalizeInterpreter.cpp.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-sse41-expm1minus-rr1-p6h5ts-nr2-u24.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f16-f32-cvt-sse41-int16.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f16-f32-cvt-sse41-int32.c.o
[ 34%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/functorch/Interpreter.cpp.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-f16-cvt-sse41.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-roundd-sse41.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-roundne-sse41.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-roundu-sse41.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-roundz-sse41.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-1x4c8-minmax-sse41-ld64.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-1x4c8-minmax-sse41-ld128.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-2x4c8-minmax-sse41-ld64.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-2x4c8-minmax-sse41-ld128.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-3x4c8-minmax-sse41-ld64.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-3x4c8-minmax-sse41-ld128.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-4x4c8-minmax-sse41-ld64.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-4x4c8-minmax-sse41-ld128.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-1x4c8-minmax-sse41-ld64.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-1x4c8-minmax-sse41-ld128.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-2x4c8-minmax-sse41-ld64.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-2x4c8-minmax-sse41-ld128.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-3x4c8-minmax-sse41-ld64.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-3x4c8-minmax-sse41-ld128.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-4x4c8-minmax-sse41-ld64.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-4x4c8-minmax-sse41-ld128.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-1x4c8-minmax-sse41-ld64.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-1x4c8-minmax-sse41-ld128.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-2x4c8-minmax-sse41-ld64.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-2x4c8-minmax-sse41-ld128.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-3x4c8-minmax-sse41-ld64.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-3x4c8-minmax-sse41-ld128.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-4x4c8-minmax-sse41-ld64.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-4x4c8-minmax-sse41-ld128.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-5f5m5l8c4s4r-minmax-fp32-sse41-mul32.c.o
[ 34%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/functorch/LegacyBatchingRegistrations.cpp.o
[ 34%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/functorch/LegacyVmapTransforms.cpp.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-5f5m5l8c8s8r-minmax-fp32-sse41-mul16-add16.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-5f5m5l8c8s8r-minmax-fp32-sse41-mul16.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-5f5m5l16c4s4r-minmax-fp32-sse41-mul32.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-5f5m5l16c8s8r-minmax-fp32-sse41-mul16-add16.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-5f5m5l16c8s8r-minmax-fp32-sse41-mul16.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-6f6m7l8c4s4r-minmax-fp32-sse41-mul32.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-6f6m7l8c8s8r-minmax-fp32-sse41-mul16-add16.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-6f6m7l8c8s8r-minmax-fp32-sse41-mul16.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-6f6m7l16c4s4r-minmax-fp32-sse41-mul32.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-6f6m7l16c8s8r-minmax-fp32-sse41-mul16-add16.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-6f6m7l16c8s8r-minmax-fp32-sse41-mul16.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-8f8m9l8c4s4r-minmax-fp32-sse41-mul32.c.o
[ 34%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/functorch/PlumbingHelper.cpp.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-8f8m9l8c8s8r-minmax-fp32-sse41-mul16-add16.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-8f8m9l8c8s8r-minmax-fp32-sse41-mul16.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-8f8m9l16c4s4r-minmax-fp32-sse41-mul32.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-8f8m9l16c8s8r-minmax-fp32-sse41-mul16-add16.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-8f8m9l16c8s8r-minmax-fp32-sse41-mul16.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p8c-minmax-fp32-sse41-mul16-add16.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p8c-minmax-fp32-sse41-mul16.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p8c-minmax-fp32-sse41-mul32.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p16c-minmax-fp32-sse41-mul16-add16.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p16c-minmax-fp32-sse41-mul16.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p16c-minmax-fp32-sse41-mul32.c.o
[ 34%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/functorch/PyTorchOperatorHacks.cpp.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p8c-minmax-fp32-sse41-mul16-add16.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p8c-minmax-fp32-sse41-mul16.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p8c-minmax-fp32-sse41-mul32.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p16c-minmax-fp32-sse41-mul16-add16.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p16c-minmax-fp32-sse41-mul16.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p16c-minmax-fp32-sse41-mul32.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-f32-vcvt/gen/qs8-f32-vcvt-sse41-u8.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-f32-vcvt/gen/qs8-f32-vcvt-sse41-u16.c.o
[ 34%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-f32-vcvt/gen/qs8-f32-vcvt-sse41-u24.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-f32-vcvt/gen/qs8-f32-vcvt-sse41-u32.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-gavgpool/gen/qs8-gavgpool-7p7x-minmax-fp32-sse41-c8.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-gavgpool/gen/qs8-gavgpool-7p7x-minmax-fp32-sse41-c16.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-gavgpool/gen/qs8-gavgpool-7p7x-minmax-fp32-sse41-c24.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-gavgpool/gen/qs8-gavgpool-7x-minmax-fp32-sse41-c8.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-gavgpool/gen/qs8-gavgpool-7x-minmax-fp32-sse41-c16.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-gavgpool/gen/qs8-gavgpool-7x-minmax-fp32-sse41-c24.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-3p8c-minmax-fp32-sse41-mul16.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-5f5m5l8c4s4r-minmax-fp32-sse41-mul32.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-5f5m5l8c8s8r-minmax-fp32-sse41-mul16-add16.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-5f5m5l8c8s8r-minmax-fp32-sse41-mul16.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-5f5m5l16c4s4r-minmax-fp32-sse41-mul32.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-5f5m5l16c8s8r-minmax-fp32-sse41-mul16-add16.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-5f5m5l16c8s8r-minmax-fp32-sse41-mul16.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-6f6m7l8c4s4r-minmax-fp32-sse41-mul32.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-6f6m7l8c8s8r-minmax-fp32-sse41-mul16-add16.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-6f6m7l8c8s8r-minmax-fp32-sse41-mul16.c.o
[ 35%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/functorch/TensorWrapper.cpp.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-6f6m7l16c4s4r-minmax-fp32-sse41-mul32.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-6f6m7l16c8s8r-minmax-fp32-sse41-mul16-add16.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-6f6m7l16c8s8r-minmax-fp32-sse41-mul16.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-8f8m9l8c4s4r-minmax-fp32-sse41-mul32.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-8f8m9l8c8s8r-minmax-fp32-sse41-mul16-add16.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-8f8m9l8c8s8r-minmax-fp32-sse41-mul16.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-8f8m9l16c4s4r-minmax-fp32-sse41-mul32.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-8f8m9l16c8s8r-minmax-fp32-sse41-mul16-add16.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-8f8m9l16c8s8r-minmax-fp32-sse41-mul16.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p8c-minmax-fp32-sse41-mul16-add16.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p8c-minmax-fp32-sse41-mul16.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p8c-minmax-fp32-sse41-mul32.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p16c-minmax-fp32-sse41-mul16-add16.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p16c-minmax-fp32-sse41-mul16.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p16c-minmax-fp32-sse41-mul32.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p8c-minmax-fp32-sse41-mul16-add16.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p8c-minmax-fp32-sse41-mul16.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p8c-minmax-fp32-sse41-mul32.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p16c-minmax-fp32-sse41-mul16-add16.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p16c-minmax-fp32-sse41-mul16.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p16c-minmax-fp32-sse41-mul32.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x4c2-minmax-fp32-sse41-ld64.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x4c2-minmax-fp32-sse41-ld128.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x4c2s4-minmax-fp32-sse41-ld64.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x4c2s4-minmax-fp32-sse41-ld128.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x4c8-minmax-fp32-sse41-ld64.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x4c8-minmax-fp32-sse41-ld128.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x4c2-minmax-fp32-sse41-ld64.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x4c2-minmax-fp32-sse41-ld128.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x4c2s4-minmax-fp32-sse41-ld64.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x4c2s4-minmax-fp32-sse41-ld128.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x4c8-minmax-fp32-sse41-ld64.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x4c8-minmax-fp32-sse41-ld128.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x4c2-minmax-fp32-sse41-ld64.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x4c2-minmax-fp32-sse41-ld128.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x4c2s4-minmax-fp32-sse41-ld64.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x4c2s4-minmax-fp32-sse41-ld128.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x4c8-minmax-fp32-sse41-ld64.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x4c8-minmax-fp32-sse41-ld128.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-4x4c2-minmax-fp32-sse41-ld64.c.o
[ 35%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/functorch/VmapInterpreter.cpp.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-4x4c2-minmax-fp32-sse41-ld128.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-4x4c2s4-minmax-fp32-sse41-ld64.c.o
[ 35%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/functorch/VmapModeRegistrations.cpp.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-4x4c2s4-minmax-fp32-sse41-ld128.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x4c2-minmax-fp32-sse41-ld64.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x4c2-minmax-fp32-sse41-ld128.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x4c2s4-minmax-fp32-sse41-ld64.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x4c2s4-minmax-fp32-sse41-ld128.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x4c8-minmax-fp32-sse41-ld64.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x4c8-minmax-fp32-sse41-ld128.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x4c2-minmax-fp32-sse41-ld64.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x4c2-minmax-fp32-sse41-ld128.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x4c2s4-minmax-fp32-sse41-ld64.c.o
[ 35%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x4c2s4-minmax-fp32-sse41-ld128.c.o
[ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x4c8-minmax-fp32-sse41-ld64.c.o
[ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x4c8-minmax-fp32-sse41-ld128.c.o
[ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x4c2-minmax-fp32-sse41-ld64.c.o
[ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x4c2-minmax-fp32-sse41-ld128.c.o
[ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x4c2s4-minmax-fp32-sse41-ld64.c.o
[ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x4c2s4-minmax-fp32-sse41-ld128.c.o
[ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x4c8-minmax-fp32-sse41-ld64.c.o
[ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x4c8-minmax-fp32-sse41-ld128.c.o
[ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-4x4c2-minmax-fp32-sse41-ld64.c.o
[ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-4x4c2-minmax-fp32-sse41-ld128.c.o
[ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-4x4c2s4-minmax-fp32-sse41-ld64.c.o
[ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-4x4c2s4-minmax-fp32-sse41-ld128.c.o
[ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-requantization/qs8-requantization-fp32-sse41.c.o
[ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-requantization/qs8-requantization-gemmlowp-sse41.c.o
[ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-requantization/qs8-requantization-rndna-sse41.c.o
[ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-requantization/qs8-requantization-rndnu-sse41-sra.c.o
[ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-requantization/qs8-requantization-rndnu-sse41-srl.c.o
[ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vadd/gen/qs8-vadd-minmax-sse41-mul16-ld64-u8.c.o
[ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vadd/gen/qs8-vadd-minmax-sse41-mul16-ld64-u16.c.o
[ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vadd/gen/qs8-vadd-minmax-sse41-mul16-ld64-u24.c.o
[ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vadd/gen/qs8-vadd-minmax-sse41-mul16-ld64-u32.c.o
[ 36%] Building C object
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vadd/gen/qs8-vadd-minmax-sse41-mul32-ld32-u8.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vadd/gen/qs8-vadd-minmax-sse41-mul32-ld32-u16.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vadd/gen/qs8-vadd-minmax-sse41-mul32-ld32-u24.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vadd/gen/qs8-vadd-minmax-sse41-mul32-ld32-u32.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vaddc/gen/qs8-vaddc-minmax-sse41-mul16-ld64-u8.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vaddc/gen/qs8-vaddc-minmax-sse41-mul16-ld64-u16.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vaddc/gen/qs8-vaddc-minmax-sse41-mul16-ld64-u24.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vaddc/gen/qs8-vaddc-minmax-sse41-mul16-ld64-u32.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vaddc/gen/qs8-vaddc-minmax-sse41-mul32-ld32-u8.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vaddc/gen/qs8-vaddc-minmax-sse41-mul32-ld32-u16.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vaddc/gen/qs8-vaddc-minmax-sse41-mul32-ld32-u24.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vaddc/gen/qs8-vaddc-minmax-sse41-mul32-ld32-u32.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vcvt/gen/qs8-vcvt-sse41-u8.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vcvt/gen/qs8-vcvt-sse41-u16.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vcvt/gen/qs8-vcvt-sse41-u32.c.o [ 36%] Building CXX object 
caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/record_function.cpp.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vhswish/gen/qs8-vhswish-sse41-u8.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vhswish/gen/qs8-vhswish-sse41-u16.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vhswish/gen/qs8-vhswish-sse41-u32.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vlrelu/gen/qs8-vlrelu-sse41-u8.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vlrelu/gen/qs8-vlrelu-sse41-u16.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vlrelu/gen/qs8-vlrelu-sse41-u32.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vmul/gen/qs8-vmul-minmax-fp32-sse41-mul16-ld64-u8.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vmul/gen/qs8-vmul-minmax-fp32-sse41-mul16-ld64-u16.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vmulc/gen/qs8-vmulc-minmax-fp32-sse41-mul16-ld64-u8.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vmulc/gen/qs8-vmulc-minmax-fp32-sse41-mul16-ld64-u16.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs16-qs8-vcvt/gen/qs16-qs8-vcvt-sse41-u4.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs16-qs8-vcvt/gen/qs16-qs8-vcvt-sse41-u8.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs16-qs8-vcvt/gen/qs16-qs8-vcvt-sse41-u16.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-5f5m5l8c4s4r-minmax-fp32-sse41-mul32.c.o [ 36%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-5f5m5l8c8s8r-minmax-fp32-sse41-mul16.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-5f5m5l16c4s4r-minmax-fp32-sse41-mul32.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-5f5m5l16c8s8r-minmax-fp32-sse41-mul16.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-6f6m7l8c4s4r-minmax-fp32-sse41-mul32.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-6f6m7l8c8s8r-minmax-fp32-sse41-mul16.c.o [ 36%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/ATenGeneral.cpp.o [ 36%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/BackendSelectFallbackKernel.cpp.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-6f6m7l16c4s4r-minmax-fp32-sse41-mul32.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-6f6m7l16c8s8r-minmax-fp32-sse41-mul16.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-8f8m9l8c4s4r-minmax-fp32-sse41-mul32.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-8f8m9l8c8s8r-minmax-fp32-sse41-mul16.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-8f8m9l16c4s4r-minmax-fp32-sse41-mul32.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-8f8m9l16c8s8r-minmax-fp32-sse41-mul16.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-9p8c-minmax-fp32-sse41-mul16.c.o [ 36%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-9p8c-minmax-fp32-sse41-mul32.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-9p16c-minmax-fp32-sse41-mul16.c.o [ 36%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/DeprecatedTypeProperties.cpp.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-9p16c-minmax-fp32-sse41-mul32.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-25p8c-minmax-fp32-sse41-mul16.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-25p8c-minmax-fp32-sse41-mul32.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-25p16c-minmax-fp32-sse41-mul16.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-25p16c-minmax-fp32-sse41-mul32.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-f32-vcvt/gen/qu8-f32-vcvt-sse41-u8.c.o [ 36%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-f32-vcvt/gen/qu8-f32-vcvt-sse41-u16.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-f32-vcvt/gen/qu8-f32-vcvt-sse41-u24.c.o [ 37%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/DeprecatedTypePropertiesRegistry.cpp.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-f32-vcvt/gen/qu8-f32-vcvt-sse41-u32.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gavgpool/gen/qu8-gavgpool-7p7x-minmax-fp32-sse41-c8.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gavgpool/gen/qu8-gavgpool-7p7x-minmax-fp32-sse41-c16.c.o [ 37%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gavgpool/gen/qu8-gavgpool-7p7x-minmax-fp32-sse41-c24.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gavgpool/gen/qu8-gavgpool-7x-minmax-fp32-sse41-c8.c.o [ 37%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/Dict.cpp.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gavgpool/gen/qu8-gavgpool-7x-minmax-fp32-sse41-c16.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gavgpool/gen/qu8-gavgpool-7x-minmax-fp32-sse41-c24.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-1x4c2-minmax-fp32-sse41-ld64.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-1x4c2-minmax-fp32-sse41-ld128.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-1x4c2s4-minmax-fp32-sse41-ld64.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-1x4c2s4-minmax-fp32-sse41-ld128.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-1x4c8-minmax-fp32-sse41-ld64.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-1x4c8-minmax-fp32-sse41-ld128.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-2x4c2-minmax-fp32-sse41-ld64.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-2x4c2-minmax-fp32-sse41-ld128.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-2x4c2s4-minmax-fp32-sse41-ld64.c.o [ 37%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/Dimname.cpp.o [ 37%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-2x4c2s4-minmax-fp32-sse41-ld128.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-2x4c8-minmax-fp32-sse41-ld64.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-2x4c8-minmax-fp32-sse41-ld128.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-3x4c2-minmax-fp32-sse41-ld64.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-3x4c2-minmax-fp32-sse41-ld128.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-3x4c2s4-minmax-fp32-sse41-ld64.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-3x4c2s4-minmax-fp32-sse41-ld128.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-3x4c8-minmax-fp32-sse41-ld64.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-3x4c8-minmax-fp32-sse41-ld128.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-4x4c2-minmax-fp32-sse41-ld64.c.o [ 37%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/Formatting.cpp.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-4x4c2-minmax-fp32-sse41-ld128.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-4x4c2s4-minmax-fp32-sse41-ld64.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-4x4c2s4-minmax-fp32-sse41-ld128.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-1x4c2-minmax-fp32-sse41-ld64.c.o [ 37%] Building C 
object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-1x4c2-minmax-fp32-sse41-ld128.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-1x4c2s4-minmax-fp32-sse41-ld64.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-1x4c2s4-minmax-fp32-sse41-ld128.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-1x4c8-minmax-fp32-sse41-ld64.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-1x4c8-minmax-fp32-sse41-ld128.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-2x4c2-minmax-fp32-sse41-ld64.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-2x4c2-minmax-fp32-sse41-ld128.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-2x4c2s4-minmax-fp32-sse41-ld64.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-2x4c2s4-minmax-fp32-sse41-ld128.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-2x4c8-minmax-fp32-sse41-ld64.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-2x4c8-minmax-fp32-sse41-ld128.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-3x4c2-minmax-fp32-sse41-ld64.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-3x4c2-minmax-fp32-sse41-ld128.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-3x4c2s4-minmax-fp32-sse41-ld64.c.o [ 37%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-3x4c2s4-minmax-fp32-sse41-ld128.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-3x4c8-minmax-fp32-sse41-ld64.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-3x4c8-minmax-fp32-sse41-ld128.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-4x4c2-minmax-fp32-sse41-ld64.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-4x4c2-minmax-fp32-sse41-ld128.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-4x4c2s4-minmax-fp32-sse41-ld64.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-4x4c2s4-minmax-fp32-sse41-ld128.c.o [ 37%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/Generator.cpp.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-requantization/qu8-requantization-gemmlowp-sse41.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-requantization/qu8-requantization-rndna-sse41.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vadd/gen/qu8-vadd-minmax-sse41-mul16-ld64-u8.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vadd/gen/qu8-vadd-minmax-sse41-mul16-ld64-u16.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vadd/gen/qu8-vadd-minmax-sse41-mul32-ld32-u8.c.o [ 37%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/GeneratorForPrivateuseone.cpp.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vadd/gen/qu8-vadd-minmax-sse41-mul32-ld32-u16.c.o [ 37%] Building CXX object 
caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/List.cpp.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vaddc/gen/qu8-vaddc-minmax-sse41-mul16-ld64-u8.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vaddc/gen/qu8-vaddc-minmax-sse41-mul16-ld64-u16.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vaddc/gen/qu8-vaddc-minmax-sse41-mul32-ld32-u8.c.o [ 37%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/MetaFallbackKernel.cpp.o [ 37%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/NamedRegistrations.cpp.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vaddc/gen/qu8-vaddc-minmax-sse41-mul32-ld32-u16.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vcvt/gen/qu8-vcvt-sse41-u8.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vcvt/gen/qu8-vcvt-sse41-u16.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vcvt/gen/qu8-vcvt-sse41-u32.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vhswish/gen/qu8-vhswish-sse41-u8.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vhswish/gen/qu8-vhswish-sse41-u16.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vhswish/gen/qu8-vhswish-sse41-u32.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vlrelu/gen/qu8-vlrelu-sse41-u8.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vlrelu/gen/qu8-vlrelu-sse41-u16.c.o [ 37%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vlrelu/gen/qu8-vlrelu-sse41-u32.c.o [ 38%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vmul/gen/qu8-vmul-minmax-fp32-sse41-mul16-ld64-u8.c.o [ 38%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/NamedTensor.cpp.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vmul/gen/qu8-vmul-minmax-fp32-sse41-mul16-ld64-u16.c.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vmulc/gen/qu8-vmulc-minmax-fp32-sse41-mul16-ld64-u8.c.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vmulc/gen/qu8-vmulc-minmax-fp32-sse41-mul16-ld64-u16.c.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/s8-ibilinear/gen/s8-ibilinear-sse41-c8.c.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/s8-ibilinear/gen/s8-ibilinear-sse41-c16.c.o [ 38%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/NestedIntSymNodeImpl.cpp.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/s8-maxpool/s8-maxpool-9p8x-minmax-sse41-c16.c.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/s8-vclamp/s8-vclamp-sse41-u64.c.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/u8-ibilinear/gen/u8-ibilinear-sse41-c8.c.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/u8-ibilinear/gen/u8-ibilinear-sse41-c16.c.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32-vcvt/gen/f16-f32-vcvt-avx-int16-u8.c.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32-vcvt/gen/f16-f32-vcvt-avx-int16-u16.c.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32-vcvt/gen/f16-f32-vcvt-avx-int16-u24.c.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32-vcvt/gen/f16-f32-vcvt-avx-int16-u32.c.o [ 38%] 
Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32-vcvt/gen/f16-f32-vcvt-avx-int32-u8.c.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32-vcvt/gen/f16-f32-vcvt-avx-int32-u16.c.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32-vcvt/gen/f16-f32-vcvt-avx-int32-u24.c.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32-vcvt/gen/f16-f32-vcvt-avx-int32-u32.c.o [ 38%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/PythonFallbackKernel.cpp.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-3p8c-minmax-avx-acc2.c.o [ 38%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/PythonOpRegistrationTrampoline.cpp.o [ 38%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/Range.cpp.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-3p8c-minmax-avx.c.o [ 38%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/Tensor.cpp.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-3p16c-minmax-avx-acc2.c.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-3p16c-minmax-avx.c.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-4p8c-minmax-avx-acc2.c.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-4p8c-minmax-avx.c.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-4p16c-minmax-avx-acc2.c.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-4p16c-minmax-avx.c.o [ 38%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-5f5m5l8c8s4r-minmax-avx-acc2.c.o [ 38%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/TorchDispatchUtils.cpp.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-5f5m5l8c8s4r-minmax-avx.c.o [ 38%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/VariableFallbackKernel.cpp.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-5f5m5l16c8s4r-minmax-avx-acc2.c.o [ 38%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/VariableHooksInterface.cpp.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-5f5m5l16c8s4r-minmax-avx.c.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-6f6m7l8c8s4r-minmax-avx-acc2.c.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-6f6m7l8c8s4r-minmax-avx.c.o [ 38%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/Vitals.cpp.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-6f6m7l16c8s4r-minmax-avx-acc2.c.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-6f6m7l16c8s4r-minmax-avx.c.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-8f8m9l8c8s4r-minmax-avx-acc2.c.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-8f8m9l8c8s4r-minmax-avx.c.o [ 38%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/adaption.cpp.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-8f8m9l16c8s4r-minmax-avx-acc2.c.o 
[ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-8f8m9l16c8s4r-minmax-avx.c.o [ 38%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/blob.cpp.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-9p8c-minmax-avx-acc2.c.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-9p8c-minmax-avx.c.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-9p16c-minmax-avx-acc2.c.o [ 38%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/boxing/KernelFunction.cpp.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-9p16c-minmax-avx.c.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-25p8c-minmax-avx-acc2.c.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-25p8c-minmax-avx.c.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-25p16c-minmax-avx-acc2.c.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-25p16c-minmax-avx.c.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-f16-vcvt/gen/f32-f16-vcvt-avx-u8.c.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-f16-vcvt/gen/f32-f16-vcvt-avx-u16.c.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-f16-vcvt/gen/f32-f16-vcvt-avx-u24.c.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-f16-vcvt/gen/f32-f16-vcvt-avx-u32.c.o [ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-1x8-minmax-avx-broadcast.c.o [ 38%] 
Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-1x16-minmax-avx-broadcast.c.o
[ 38%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/class_type.cpp.o
[ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-3x16-minmax-avx-broadcast.c.o
[ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-4x8-minmax-avx-broadcast.c.o
[ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-4x16-minmax-avx-broadcast.c.o
[ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-5x8-minmax-avx-broadcast.c.o
[ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-5x16-minmax-avx-broadcast.c.o
[ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-6x8-minmax-avx-broadcast.c.o
[ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-6x16-minmax-avx-broadcast.c.o
[ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-7x8-minmax-avx-broadcast.c.o
[ 38%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/custom_class.cpp.o
[ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-1x8-minmax-avx-broadcast.c.o
[ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-1x16-minmax-avx-broadcast.c.o
[ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-3x16-minmax-avx-broadcast.c.o
[ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-4x8-minmax-avx-broadcast.c.o
[ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-4x16-minmax-avx-broadcast.c.o
[ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-5x8-minmax-avx-broadcast.c.o
[ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-6x8-minmax-avx-broadcast.c.o
[ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-5x16-minmax-avx-broadcast.c.o
[ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-6x16-minmax-avx-broadcast.c.o
[ 38%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-7x8-minmax-avx-broadcast.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-1x8-minmax-avx-broadcast.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-1x16-minmax-avx-broadcast.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-3x16-minmax-avx-broadcast.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-4x8-minmax-avx-broadcast.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-4x16-minmax-avx-broadcast.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-5x8-minmax-avx-broadcast.c.o
[ 39%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/dispatch/DispatchKeyExtractor.cpp.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-5x16-minmax-avx-broadcast.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-6x8-minmax-avx-broadcast.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-6x16-minmax-avx-broadcast.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-7x8-minmax-avx-broadcast.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-prelu/gen/f32-prelu-avx-2x8.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-prelu/gen/f32-prelu-avx-2x16.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc4w-gemm/gen/f32-qc4w-gemm-1x16-minmax-avx-broadcast.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc4w-gemm/gen/f32-qc4w-gemm-2x16-minmax-avx-broadcast.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc4w-gemm/gen/f32-qc4w-gemm-3x16-minmax-avx-broadcast.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc4w-gemm/gen/f32-qc4w-gemm-4x16-minmax-avx-broadcast.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc4w-gemm/gen/f32-qc4w-gemm-5x16-minmax-avx-broadcast.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc4w-gemm/gen/f32-qc4w-gemm-6x16-minmax-avx-broadcast.c.o
[ 39%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/dispatch/Dispatcher.cpp.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc4w-gemm/gen/f32-qc4w-gemm-7x16-minmax-avx-broadcast.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc4w-gemm/gen/f32-qc4w-gemm-8x16-minmax-avx-broadcast.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-1x16-minmax-avx-broadcast.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-2x16-minmax-avx-broadcast.c.o
[ 39%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/dispatch/ObservedOperators.cpp.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-3x16-minmax-avx-broadcast.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-4x16-minmax-avx-broadcast.c.o
[ 39%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/dispatch/OperatorEntry.cpp.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-5x16-minmax-avx-broadcast.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-6x16-minmax-avx-broadcast.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-7x16-minmax-avx-broadcast.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-8x16-minmax-avx-broadcast.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qs8-vcvt/gen/f32-qs8-vcvt-avx-u8.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qs8-vcvt/gen/f32-qs8-vcvt-avx-u16.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qs8-vcvt/gen/f32-qs8-vcvt-avx-u24.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qs8-vcvt/gen/f32-qs8-vcvt-avx-u32.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qu8-vcvt/gen/f32-qu8-vcvt-avx-u8.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qu8-vcvt/gen/f32-qu8-vcvt-avx-u16.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qu8-vcvt/gen/f32-qu8-vcvt-avx-u24.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qu8-vcvt/gen/f32-qu8-vcvt-avx-u32.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmax-avx-u8.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmax-avx-u16-acc2.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmax-avx-u24-acc3.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmax-avx-u32-acc2.c.o
[ 39%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/dynamic_type.cpp.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmax-avx-u32-acc4.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmin-avx-u8.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmin-avx-u16-acc2.c.o
[ 39%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/function_schema.cpp.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmin-avx-u24-acc3.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmin-avx-u32-acc2.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmin-avx-u32-acc4.c.o
[ 39%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/interned_strings.cpp.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rminmax-avx-u8.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rminmax-avx-u16-acc2.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rminmax-avx-u24-acc3.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rminmax-avx-u32-acc2.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rminmax-avx-u32-acc4.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rsum/gen/f32-rsum-avx-u8.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rsum/gen/f32-rsum-avx-u16-acc2.c.o
[ 39%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/ivalue.cpp.o
[ 39%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/library.cpp.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rsum/gen/f32-rsum-avx-u24-acc3.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rsum/gen/f32-rsum-avx-u32-acc2.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rsum/gen/f32-rsum-avx-u32-acc4.c.o
[ 39%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/op_registration/infer_schema.cpp.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vadd-minmax-avx-u8.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vadd-minmax-avx-u16.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vaddc-minmax-avx-u8.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vaddc-minmax-avx-u16.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vdiv-minmax-avx-u8.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vdiv-minmax-avx-u16.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vdivc-minmax-avx-u8.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vdivc-minmax-avx-u16.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmax-avx-u8.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmax-avx-u16.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmaxc-avx-u8.c.o
[ 39%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/op_registration/op_registration.cpp.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmaxc-avx-u16.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmin-avx-u8.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmin-avx-u16.c.o
[ 39%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vminc-avx-u8.c.o
[ 40%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vminc-avx-u16.c.o
[ 40%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmul-minmax-avx-u8.c.o
[ 40%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmul-minmax-avx-u16.c.o
[ 40%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmulc-minmax-avx-u8.c.o
[ 40%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/operator_name.cpp.o
[ 40%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmulc-minmax-avx-u16.c.o
[ 40%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/register_symbols.cpp.o
[ 40%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vrdivc-minmax-avx-u8.c.o
[ 41%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/tensor_type.cpp.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vrdivc-minmax-avx-u16.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vrsubc-minmax-avx-u8.c.o
[ 41%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/type.cpp.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vrsubc-minmax-avx-u16.c.o
[ 41%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/type_factory.cpp.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsqrdiff-avx-u8.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsqrdiff-avx-u16.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsqrdiffc-avx-u8.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsqrdiffc-avx-u16.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsub-minmax-avx-u8.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsub-minmax-avx-u16.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsubc-minmax-avx-u8.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsubc-minmax-avx-u16.c.o
[ 41%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/union_type.cpp.o
[ 41%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/frontend/error_report.cpp.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vclamp/gen/f32-vclamp-avx-u8.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vclamp/gen/f32-vclamp-avx-u16.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx-rr2-lut4-p4-perm-u8.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx-rr2-lut4-p4-perm-u16.c.o
[ 41%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/frontend/function_schema_parser.cpp.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx-rr2-lut4-p4-perm-u24.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx-rr2-lut4-p4-perm-u32.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx-rr2-lut4-p4-perm-u40.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx-rr2-lut4-p4-perm-u48.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx-rr2-lut16-p3-u8.c.o
[ 41%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/frontend/lexer.cpp.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx-rr2-lut16-p3-u16.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx-rr2-lut16-p3-u24.c.o
[ 41%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/frontend/schema_type_parser.cpp.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx-rr2-lut16-p3-u32.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx-rr2-lut16-p3-u40.c.o
[ 41%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/frontend/strtod.cpp.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx-rr2-lut16-p3-u48.c.o
[ 41%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/frontend/source_range.cpp.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx-rr2-p6-u8.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx-rr2-p6-u16.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx-rr2-p6-u24.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx-rr2-p6-u32.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx-rr2-p6-u40.c.o
[ 41%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/Activation.cpp.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx-rr2-p6-u48.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vhswish/gen/f32-vhswish-avx-u8.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vhswish/gen/f32-vhswish-avx-u16.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vlrelu/gen/f32-vlrelu-avx-u8.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vlrelu/gen/f32-vlrelu-avx-u16.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrelu/gen/f32-vrelu-avx-u8.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrelu/gen/f32-vrelu-avx-u16.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndd-avx-u8.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndd-avx-u16.c.o
[ 41%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/AdaptiveAveragePooling.cpp.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndne-avx-u8.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndne-avx-u16.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndu-avx-u8.c.o
[ 41%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/AdaptiveAveragePooling3d.cpp.o
[ 41%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/AdaptiveMaxPooling2d.cpp.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndu-avx-u16.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndz-avx-u8.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndz-avx-u16.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrsqrt/gen/f32-vrsqrt-avx-rsqrt-u8.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrsqrt/gen/f32-vrsqrt-avx-rsqrt-u16.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrsqrt/gen/f32-vrsqrt-avx-rsqrt-u32.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx-rr2-p5-div-u8.c.o
[ 41%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/AdaptiveMaxPooling3d.cpp.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx-rr2-p5-div-u16.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx-rr2-p5-div-u24.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx-rr2-p5-div-u32.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx-rr2-p5-div-u40.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx-rr2-p5-div-u48.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx-rr2-p5-div-u56.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx-rr2-p5-div-u64.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx-rr2-p5-div-u72.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx-rr2-p5-div-u80.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx-rr2-p5-nr2-u8.c.o
[ 41%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/AffineGridGenerator.cpp.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx-rr2-p5-nr2-u16.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx-rr2-p5-nr2-u24.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx-rr2-p5-nr2-u32.c.o
[ 41%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/AmpKernels.cpp.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx-rr2-p5-nr2-u40.c.o
[ 41%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx-rr2-p5-nr2-u48.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx-rr2-p5-nr2-u56.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx-rr2-p5-nr2-u64.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx-rr2-p5-nr2-u72.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx-rr2-p5-nr2-u80.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsqrt/gen/f32-vsqrt-avx-sqrt-u8.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsqrt/gen/f32-vsqrt-avx-sqrt-u16.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsqrt/gen/f32-vsqrt-avx-sqrt-u32.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-lut4-p4h2ts-perm-div-u8.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-lut4-p4h2ts-perm-div-u16.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-lut4-p4h2ts-perm-div-u24.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-lut4-p4h2ts-perm-div-u32.c.o
[ 42%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/AutogradComposite.cpp.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-lut4-p4h2ts-perm-div-u40.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-lut4-p4h2ts-perm-div-u48.c.o
[ 42%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/AveragePool2d.cpp.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-lut4-p4h2ts-perm-div-u56.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-lut4-p4h2ts-perm-div-u64.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-lut4-p4h2ts-perm-div-u72.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-lut4-p4h2ts-perm-div-u80.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-lut8-p4h3ts-div-u8.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-lut8-p4h3ts-div-u16.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-lut8-p4h3ts-div-u24.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-lut8-p4h3ts-div-u32.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-p6h5ts-div-u8.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-p6h5ts-div-u16.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-p6h5ts-div-u24.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-p6h5ts-div-u32.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-p6h5ts-div-u40.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-p6h5ts-div-u48.c.o
[ 42%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/AveragePool3d.cpp.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-p6h5ts-div-u56.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-p6h5ts-div-u64.c.o
[ 42%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/BatchLinearAlgebra.cpp.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-p6h5ts-div-u72.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-p6h5ts-div-u80.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-p6h5ts-nr1-u8.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-p6h5ts-nr1-u16.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-p6h5ts-nr1-u24.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-p6h5ts-nr1-u32.c.o
[ 42%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/BatchLinearAlgebraKernel.cpp.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-p6h5ts-nr1-u40.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-p6h5ts-nr1-u48.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-p6h5ts-nr1-u56.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-p6h5ts-nr1-u64.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-p6h5ts-nr1-u72.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-p6h5ts-nr1-u80.c.o
[ 42%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/BinaryOps.cpp.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-p6h5ts-nr2-u8.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-p6h5ts-nr2-u16.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-p6h5ts-nr2-u24.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-p6h5ts-nr2-u32.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-p6h5ts-nr2-u40.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-p6h5ts-nr2-u48.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-p6h5ts-nr2-u56.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-p6h5ts-nr2-u64.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-p6h5ts-nr2-u72.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx-expm1minus-rr1-p6h5ts-nr2-u80.c.o
[ 42%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/Blas.cpp.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vunary/gen/f32-vabs-avx-u8.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vunary/gen/f32-vabs-avx-u16.c.o
[ 42%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/BlasKernel.cpp.o
[ 42%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/Bucketization.cpp.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vunary/gen/f32-vneg-avx-u8.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vunary/gen/f32-vneg-avx-u16.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vunary/gen/f32-vsqr-avx-u8.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vunary/gen/f32-vsqr-avx-u16.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-exp-avx-rr2-p5.c.o
[ 42%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/CPUBlas.cpp.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-expm1minus-avx-rr2-lut4-p4-perm.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-expm1minus-avx-rr2-lut16-p3.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-expm1minus-avx-rr2-p6.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx-rr2-lut64-p2-div.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx-rr2-p5-div.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx-rr2-p5-nr1.c.o
[ 42%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/CPUFallback.cpp.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx-rr2-p5-nr2.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-avx-expm1minus-rr1-lut4-p4h2ts-perm-div.c.o
[ 42%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/ChanelShuffle.cpp.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-avx-expm1minus-rr1-lut8-p4h3ps-div.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-avx-expm1minus-rr1-p6h5ts-div.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-avx-expm1minus-rr1-p6h5ts-nr1.c.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-avx-expm1minus-rr1-p6h5ts-nr2.c.o
[ 42%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/Col2Im.cpp.o
[ 42%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-avx-expm1minus-rr2-lut8-p4h2ts-nr1.c.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-avx-expm1minus-rr2-lut8-p4h2ts-nr2.c.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-avx-expm1minus-rr2-lut8-p4h3ps-nr1.c.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-avx-expm1minus-rr2-lut8-p4h3ps-nr2.c.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-avx-expm1minus-rr2-lut8-p4h3ts-nr1.c.o
[ 43%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/ComparisonUtils.cpp.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-avx-expm1minus-rr2-lut8-p4h3ts-nr2.c.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-1x4c8-minmax-avx-ld64.c.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-1x4c8-minmax-avx-ld128.c.o
[ 43%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/Constraints.cpp.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-2x4c8-minmax-avx-ld64.c.o
[ 43%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/Convolution.cpp.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-2x4c8-minmax-avx-ld128.c.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-3x4c8-minmax-avx-ld64.c.o
[ 43%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/ConvolutionMM2d.cpp.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-3x4c8-minmax-avx-ld128.c.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-4x4c8-minmax-avx-ld64.c.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-4x4c8-minmax-avx-ld128.c.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-1x4c8-minmax-avx-ld64.c.o
[ 43%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/ConvolutionMM3d.cpp.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-1x4c8-minmax-avx-ld128.c.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-2x4c8-minmax-avx-ld64.c.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-2x4c8-minmax-avx-ld128.c.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-3x4c8-minmax-avx-ld64.c.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-3x4c8-minmax-avx-ld128.c.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-4x4c8-minmax-avx-ld64.c.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-4x4c8-minmax-avx-ld128.c.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-1x4c8-minmax-avx-ld64.c.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-1x4c8-minmax-avx-ld128.c.o
[ 43%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/ConvolutionTBC.cpp.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-2x4c8-minmax-avx-ld64.c.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-2x4c8-minmax-avx-ld128.c.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-3x4c8-minmax-avx-ld64.c.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-3x4c8-minmax-avx-ld128.c.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-4x4c8-minmax-avx-ld64.c.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-4x4c8-minmax-avx-ld128.c.o
[ 43%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/Copy.cpp.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-5f5m5l8c4s4r-minmax-fp32-avx-mul32.c.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-5f5m5l16c4s4r-minmax-fp32-avx-mul32.c.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-6f6m7l8c4s4r-minmax-fp32-avx-mul32.c.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-6f6m7l16c4s4r-minmax-fp32-avx-mul32.c.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-8f8m9l8c4s4r-minmax-fp32-avx-mul32.c.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-8f8m9l16c4s4r-minmax-fp32-avx-mul32.c.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p8c-minmax-fp32-avx-mul16-add16.c.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p8c-minmax-fp32-avx-mul16.c.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p8c-minmax-fp32-avx-mul32.c.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p16c-minmax-fp32-avx-mul16-add16.c.o
[ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p16c-minmax-fp32-avx-mul16.c.o
[ 43%] Building CXX object
caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/Correlation.cpp.o [ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p16c-minmax-fp32-avx-mul32.c.o [ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p8c-minmax-fp32-avx-mul16-add16.c.o [ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p8c-minmax-fp32-avx-mul16.c.o [ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p8c-minmax-fp32-avx-mul32.c.o [ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p16c-minmax-fp32-avx-mul16-add16.c.o [ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p16c-minmax-fp32-avx-mul16.c.o [ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p16c-minmax-fp32-avx-mul32.c.o [ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-f32-vcvt/gen/qs8-f32-vcvt-avx-u8.c.o [ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-f32-vcvt/gen/qs8-f32-vcvt-avx-u16.c.o [ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-f32-vcvt/gen/qs8-f32-vcvt-avx-u24.c.o [ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-f32-vcvt/gen/qs8-f32-vcvt-avx-u32.c.o [ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-3p16c-minmax-fp32-avx-mul16-add16.c.o [ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-5f5m5l8c4s4r-minmax-fp32-avx-mul32.c.o [ 43%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-5f5m5l16c4s4r-minmax-fp32-avx-mul32.c.o [ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-6f6m7l8c4s4r-minmax-fp32-avx-mul32.c.o [ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-6f6m7l16c4s4r-minmax-fp32-avx-mul32.c.o [ 43%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/Cross.cpp.o [ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-8f8m9l8c4s4r-minmax-fp32-avx-mul32.c.o [ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-8f8m9l16c4s4r-minmax-fp32-avx-mul32.c.o [ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p8c-minmax-fp32-avx-mul16-add16.c.o [ 43%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/DilatedMaxPool2d.cpp.o [ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p8c-minmax-fp32-avx-mul16.c.o [ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p8c-minmax-fp32-avx-mul32.c.o [ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p16c-minmax-fp32-avx-mul16-add16.c.o [ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p16c-minmax-fp32-avx-mul16.c.o [ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p16c-minmax-fp32-avx-mul32.c.o [ 43%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p8c-minmax-fp32-avx-mul16-add16.c.o [ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p8c-minmax-fp32-avx-mul16.c.o [ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p8c-minmax-fp32-avx-mul32.c.o [ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p16c-minmax-fp32-avx-mul16-add16.c.o [ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p16c-minmax-fp32-avx-mul16.c.o [ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p16c-minmax-fp32-avx-mul32.c.o [ 43%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x4c2-minmax-fp32-avx-ld64.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x4c2-minmax-fp32-avx-ld128.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x4c2s4-minmax-fp32-avx-ld64.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x4c2s4-minmax-fp32-avx-ld128.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x4c8-minmax-fp32-avx-ld64.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x4c8-minmax-fp32-avx-ld128.c.o [ 44%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/DilatedMaxPool3d.cpp.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x4c2-minmax-fp32-avx-ld64.c.o [ 44%] 
Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x4c2-minmax-fp32-avx-ld128.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x4c2s4-minmax-fp32-avx-ld64.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x4c2s4-minmax-fp32-avx-ld128.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x4c8-minmax-fp32-avx-ld64.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x4c8-minmax-fp32-avx-ld128.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x4c2-minmax-fp32-avx-ld64.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x4c2-minmax-fp32-avx-ld128.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x4c2s4-minmax-fp32-avx-ld64.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x4c2s4-minmax-fp32-avx-ld128.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x4c8-minmax-fp32-avx-ld64.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x4c8-minmax-fp32-avx-ld128.c.o [ 44%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/DispatchStub.cpp.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-4x4c2-minmax-fp32-avx-ld64.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-4x4c2-minmax-fp32-avx-ld128.c.o [ 44%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-4x4c2s4-minmax-fp32-avx-ld64.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-4x4c2s4-minmax-fp32-avx-ld128.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x4c2-minmax-fp32-avx-ld64.c.o [ 44%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/Distance.cpp.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x4c2-minmax-fp32-avx-ld128.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x4c2s4-minmax-fp32-avx-ld64.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x4c2s4-minmax-fp32-avx-ld128.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x4c8-minmax-fp32-avx-ld64.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x4c8-minmax-fp32-avx-ld128.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x4c2-minmax-fp32-avx-ld64.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x4c2-minmax-fp32-avx-ld128.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x4c2s4-minmax-fp32-avx-ld64.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x4c2s4-minmax-fp32-avx-ld128.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x4c8-minmax-fp32-avx-ld64.c.o [ 44%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x4c8-minmax-fp32-avx-ld128.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x4c2-minmax-fp32-avx-ld64.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x4c2-minmax-fp32-avx-ld128.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x4c2s4-minmax-fp32-avx-ld64.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x4c2s4-minmax-fp32-avx-ld128.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x4c8-minmax-fp32-avx-ld64.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x4c8-minmax-fp32-avx-ld128.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-4x4c2-minmax-fp32-avx-ld64.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-4x4c2-minmax-fp32-avx-ld128.c.o [ 44%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/Distributions.cpp.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-4x4c2s4-minmax-fp32-avx-ld64.c.o [ 44%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/Dropout.cpp.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-4x4c2s4-minmax-fp32-avx-ld128.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vadd/gen/qs8-vadd-minmax-avx-mul16-ld64-u8.c.o [ 44%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vadd/gen/qs8-vadd-minmax-avx-mul16-ld64-u16.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vadd/gen/qs8-vadd-minmax-avx-mul16-ld64-u24.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vadd/gen/qs8-vadd-minmax-avx-mul16-ld64-u32.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vadd/gen/qs8-vadd-minmax-avx-mul32-ld32-u8.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vadd/gen/qs8-vadd-minmax-avx-mul32-ld32-u16.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vadd/gen/qs8-vadd-minmax-avx-mul32-ld32-u24.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vadd/gen/qs8-vadd-minmax-avx-mul32-ld32-u32.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vaddc/gen/qs8-vaddc-minmax-avx-mul16-ld64-u8.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vaddc/gen/qs8-vaddc-minmax-avx-mul16-ld64-u16.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vaddc/gen/qs8-vaddc-minmax-avx-mul16-ld64-u24.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vaddc/gen/qs8-vaddc-minmax-avx-mul16-ld64-u32.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vaddc/gen/qs8-vaddc-minmax-avx-mul32-ld32-u8.c.o [ 44%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/Embedding.cpp.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vaddc/gen/qs8-vaddc-minmax-avx-mul32-ld32-u16.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vaddc/gen/qs8-vaddc-minmax-avx-mul32-ld32-u24.c.o [ 44%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vaddc/gen/qs8-vaddc-minmax-avx-mul32-ld32-u32.c.o [ 44%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/EmbeddingBag.cpp.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vcvt/gen/qs8-vcvt-avx-u8.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vcvt/gen/qs8-vcvt-avx-u16.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vcvt/gen/qs8-vcvt-avx-u32.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vhswish/gen/qs8-vhswish-avx-u8.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vhswish/gen/qs8-vhswish-avx-u16.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vhswish/gen/qs8-vhswish-avx-u32.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vlrelu/gen/qs8-vlrelu-avx-u8.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vlrelu/gen/qs8-vlrelu-avx-u16.c.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vlrelu/gen/qs8-vlrelu-avx-u32.c.o [ 44%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/Fill.cpp.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vmul/gen/qs8-vmul-minmax-fp32-avx-mul16-ld64-u8.c.o [ 44%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/ForeachOpsKernels.cpp.o [ 44%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vmul/gen/qs8-vmul-minmax-fp32-avx-mul16-ld64-u16.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vmulc/gen/qs8-vmulc-minmax-fp32-avx-mul16-ld64-u8.c.o [ 45%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vmulc/gen/qs8-vmulc-minmax-fp32-avx-mul16-ld64-u16.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs16-qs8-vcvt/gen/qs16-qs8-vcvt-avx-u4.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs16-qs8-vcvt/gen/qs16-qs8-vcvt-avx-u8.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs16-qs8-vcvt/gen/qs16-qs8-vcvt-avx-u16.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-5f5m5l8c4s4r-minmax-fp32-avx-mul32.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-5f5m5l16c4s4r-minmax-fp32-avx-mul32.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-6f6m7l8c4s4r-minmax-fp32-avx-mul32.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-6f6m7l16c4s4r-minmax-fp32-avx-mul32.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-8f8m9l8c4s4r-minmax-fp32-avx-mul32.c.o [ 45%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/FractionalMaxPool2d.cpp.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-8f8m9l16c4s4r-minmax-fp32-avx-mul32.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-9p8c-minmax-fp32-avx-mul16.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-9p8c-minmax-fp32-avx-mul32.c.o [ 45%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/FractionalMaxPool3d.cpp.o [ 45%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/FunctionOfAMatrixUtils.cpp.o [ 45%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-9p16c-minmax-fp32-avx-mul16.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-9p16c-minmax-fp32-avx-mul32.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-25p8c-minmax-fp32-avx-mul16.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-25p8c-minmax-fp32-avx-mul32.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-25p16c-minmax-fp32-avx-mul16.c.o [ 45%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/GatedLinearUnit.cpp.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-25p16c-minmax-fp32-avx-mul32.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-f32-vcvt/gen/qu8-f32-vcvt-avx-u8.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-f32-vcvt/gen/qu8-f32-vcvt-avx-u16.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-f32-vcvt/gen/qu8-f32-vcvt-avx-u24.c.o [ 45%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/GridSampler.cpp.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-f32-vcvt/gen/qu8-f32-vcvt-avx-u32.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-1x4c2-minmax-fp32-avx-ld64.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-1x4c2-minmax-fp32-avx-ld128.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-1x4c2s4-minmax-fp32-avx-ld64.c.o [ 45%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-1x4c2s4-minmax-fp32-avx-ld128.c.o [ 45%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/Histogram.cpp.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-1x4c8-minmax-fp32-avx-ld64.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-1x4c8-minmax-fp32-avx-ld128.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-2x4c2-minmax-fp32-avx-ld64.c.o [ 45%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/Im2Col.cpp.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-2x4c2-minmax-fp32-avx-ld128.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-2x4c2s4-minmax-fp32-avx-ld64.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-2x4c2s4-minmax-fp32-avx-ld128.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-2x4c8-minmax-fp32-avx-ld64.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-2x4c8-minmax-fp32-avx-ld128.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-3x4c2-minmax-fp32-avx-ld64.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-3x4c2-minmax-fp32-avx-ld128.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-3x4c2s4-minmax-fp32-avx-ld64.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-3x4c2s4-minmax-fp32-avx-ld128.c.o [ 45%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-3x4c8-minmax-fp32-avx-ld64.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-3x4c8-minmax-fp32-avx-ld128.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-4x4c2-minmax-fp32-avx-ld64.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-4x4c2-minmax-fp32-avx-ld128.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-4x4c2s4-minmax-fp32-avx-ld64.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-4x4c2s4-minmax-fp32-avx-ld128.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-1x4c2-minmax-fp32-avx-ld64.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-1x4c2-minmax-fp32-avx-ld128.c.o [ 45%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/IndexingUtils.cpp.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-1x4c2s4-minmax-fp32-avx-ld64.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-1x4c2s4-minmax-fp32-avx-ld128.c.o [ 45%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/Integration.cpp.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-1x4c8-minmax-fp32-avx-ld64.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-1x4c8-minmax-fp32-avx-ld128.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-2x4c2-minmax-fp32-avx-ld64.c.o [ 45%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-2x4c2-minmax-fp32-avx-ld128.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-2x4c2s4-minmax-fp32-avx-ld64.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-2x4c2s4-minmax-fp32-avx-ld128.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-2x4c8-minmax-fp32-avx-ld64.c.o [ 45%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/Itertools.cpp.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-2x4c8-minmax-fp32-avx-ld128.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-3x4c2-minmax-fp32-avx-ld64.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-3x4c2-minmax-fp32-avx-ld128.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-3x4c2s4-minmax-fp32-avx-ld64.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-3x4c2s4-minmax-fp32-avx-ld128.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-3x4c8-minmax-fp32-avx-ld64.c.o [ 45%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/LegacyBatching.cpp.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-3x4c8-minmax-fp32-avx-ld128.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-4x4c2-minmax-fp32-avx-ld64.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-4x4c2-minmax-fp32-avx-ld128.c.o [ 45%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-4x4c2s4-minmax-fp32-avx-ld64.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-4x4c2s4-minmax-fp32-avx-ld128.c.o [ 45%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/LegacyBridge.cpp.o [ 45%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/Lerp.cpp.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vadd/gen/qu8-vadd-minmax-avx-mul16-ld64-u8.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vadd/gen/qu8-vadd-minmax-avx-mul16-ld64-u16.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vadd/gen/qu8-vadd-minmax-avx-mul32-ld32-u8.c.o [ 45%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vadd/gen/qu8-vadd-minmax-avx-mul32-ld32-u16.c.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vaddc/gen/qu8-vaddc-minmax-avx-mul16-ld64-u8.c.o [ 46%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/Linear.cpp.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vaddc/gen/qu8-vaddc-minmax-avx-mul16-ld64-u16.c.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vaddc/gen/qu8-vaddc-minmax-avx-mul32-ld32-u8.c.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vaddc/gen/qu8-vaddc-minmax-avx-mul32-ld32-u16.c.o [ 46%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/LinearAlgebra.cpp.o [ 46%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/Loss.cpp.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vcvt/gen/qu8-vcvt-avx-u8.c.o [ 46%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vcvt/gen/qu8-vcvt-avx-u16.c.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vcvt/gen/qu8-vcvt-avx-u32.c.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vhswish/gen/qu8-vhswish-avx-u8.c.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vhswish/gen/qu8-vhswish-avx-u16.c.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vhswish/gen/qu8-vhswish-avx-u32.c.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vlrelu/gen/qu8-vlrelu-avx-u8.c.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vlrelu/gen/qu8-vlrelu-avx-u16.c.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vlrelu/gen/qu8-vlrelu-avx-u32.c.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vmul/gen/qu8-vmul-minmax-fp32-avx-mul16-ld64-u8.c.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vmul/gen/qu8-vmul-minmax-fp32-avx-mul16-ld64-u16.c.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vmulc/gen/qu8-vmulc-minmax-fp32-avx-mul16-ld64-u8.c.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vmulc/gen/qu8-vmulc-minmax-fp32-avx-mul16-ld64-u16.c.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-lut/gen/x8-lut-avx-u16.c.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-lut/gen/x8-lut-avx-u32.c.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-lut/gen/x8-lut-avx-u48.c.o [ 46%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/LossCTC.cpp.o [ 46%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-lut/gen/x8-lut-avx-u64.c.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packw/gen/x32-packw-x8-gemm-goi-avx-u4-prfm.c.o [ 46%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/LossMultiLabelMargin.cpp.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packw/gen/x32-packw-x8-gemm-goi-avx-u4.c.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packw/gen/x32-packw-x8s4-gemm-goi-avx-u4-prfm.c.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packw/gen/x32-packw-x8s4-gemm-goi-avx-u4.c.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packw/gen/x32-packw-x16-gemm-goi-avx-u4-prfm.c.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packw/gen/x32-packw-x16-gemm-goi-avx-u4.c.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packw/gen/x32-packw-x16s4-gemm-goi-avx-u4-prfm.c.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packw/gen/x32-packw-x16s4-gemm-goi-avx-u4.c.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-transposec/gen/x32-transposec-8x8-multi-mov-avx.c.o [ 46%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/LossMultiMargin.cpp.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-transposec/gen/x32-transposec-8x8-multi-switch-avx.c.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-transposec/gen/x32-transposec-8x8-reuse-mov-avx.c.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-transposec/gen/x32-transposec-8x8-reuse-multi-avx.c.o [ 46%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/LossNLL.cpp.o [ 
46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-transposec/gen/x32-transposec-8x8-reuse-switch-avx.c.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x64-transposec/gen/x64-transposec-4x4-multi-mov-avx.c.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x64-transposec/gen/x64-transposec-4x4-multi-multi-avx.c.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x64-transposec/gen/x64-transposec-4x4-multi-switch-avx.c.o [ 46%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/LossNLL2d.cpp.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x64-transposec/gen/x64-transposec-4x4-reuse-mov-avx.c.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x64-transposec/gen/x64-transposec-4x4-reuse-multi-avx.c.o [ 46%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/MaxPooling.cpp.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x64-transposec/gen/x64-transposec-4x4-reuse-switch-avx.c.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-avgpool/f16-avgpool-9p8x-minmax-f16c-c8.c.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-avgpool/f16-avgpool-9x-minmax-f16c-c8.c.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32-vcvt/gen/f16-f32-vcvt-f16c-u8.c.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32-vcvt/gen/f16-f32-vcvt-f16c-u16.c.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32acc-rsum/gen/f16-f32acc-rsum-f16c-u8.c.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32acc-rsum/gen/f16-f32acc-rsum-f16c-u16-acc2.c.o [ 46%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32acc-rsum/gen/f16-f32acc-rsum-f16c-u24-acc3.c.o [ 46%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32acc-rsum/gen/f16-f32acc-rsum-f16c-u32-acc2.c.o [ 47%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/MaxUnpooling.cpp.o [ 47%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32acc-rsum/gen/f16-f32acc-rsum-f16c-u32-acc4.c.o [ 47%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-gavgpool/gen/f16-gavgpool-7p7x-minmax-f16c-c8.c.o [ 47%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-gavgpool/gen/f16-gavgpool-7p7x-minmax-f16c-c16.c.o [ 47%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-gavgpool/gen/f16-gavgpool-7p7x-minmax-f16c-c24.c.o [ 47%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/Memory.cpp.o [ 47%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/MetaTensor.cpp.o [ 47%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-gavgpool/gen/f16-gavgpool-7p7x-minmax-f16c-c32.c.o [ 47%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-gavgpool/gen/f16-gavgpool-7x-minmax-f16c-c8.c.o [ 47%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-gavgpool/gen/f16-gavgpool-7x-minmax-f16c-c16.c.o [ 47%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-gavgpool/gen/f16-gavgpool-7x-minmax-f16c-c24.c.o [ 47%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/NNPACK.cpp.o [ 47%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/NaiveConvolutionTranspose2d.cpp.o [ 47%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-gavgpool/gen/f16-gavgpool-7x-minmax-f16c-c32.c.o [ 47%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-maxpool/f16-maxpool-9p8x-minmax-f16c-c8.c.o [ 47%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-prelu/gen/f16-prelu-f16c-2x8.c.o [ 47%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/NaiveConvolutionTranspose3d.cpp.o [ 47%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-prelu/gen/f16-prelu-f16c-2x16.c.o [ 47%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-rminmax/f16-rmax-f16c-u32.c.o [ 47%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vbinary/gen/f16-vadd-minmax-f16c-u8.c.o [ 47%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vbinary/gen/f16-vadd-minmax-f16c-u16.c.o [ 47%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vbinary/gen/f16-vaddc-minmax-f16c-u8.c.o [ 47%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vbinary/gen/f16-vaddc-minmax-f16c-u16.c.o [ 47%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/NaiveDilatedConvolution.cpp.o [ 47%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vbinary/gen/f16-vdiv-minmax-f16c-u8.c.o [ 47%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vbinary/gen/f16-vdiv-minmax-f16c-u16.c.o [ 47%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vbinary/gen/f16-vdivc-minmax-f16c-u8.c.o [ 47%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vbinary/gen/f16-vdivc-minmax-f16c-u16.c.o [ 47%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vbinary/gen/f16-vmax-f16c-u8.c.o [ 47%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vbinary/gen/f16-vmax-f16c-u16.c.o [ 47%] Building CXX object 
caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/NamedTensor.cpp.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vbinary/gen/f16-vmaxc-f16c-u8.c.o [ 48%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/NegateFallback.cpp.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vbinary/gen/f16-vmaxc-f16c-u16.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vbinary/gen/f16-vmin-f16c-u8.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vbinary/gen/f16-vmin-f16c-u16.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vbinary/gen/f16-vminc-f16c-u8.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vbinary/gen/f16-vminc-f16c-u16.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vbinary/gen/f16-vmul-minmax-f16c-u8.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vbinary/gen/f16-vmul-minmax-f16c-u16.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vbinary/gen/f16-vmulc-minmax-f16c-u8.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vbinary/gen/f16-vmulc-minmax-f16c-u16.c.o [ 48%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/Normalization.cpp.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vbinary/gen/f16-vrdivc-minmax-f16c-u8.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vbinary/gen/f16-vrdivc-minmax-f16c-u16.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vbinary/gen/f16-vrsubc-minmax-f16c-u8.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vbinary/gen/f16-vrsubc-minmax-f16c-u16.c.o [ 
48%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/Onehot.cpp.o [ 48%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/PackedSequence.cpp.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vbinary/gen/f16-vsqrdiff-f16c-u8.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vbinary/gen/f16-vsqrdiff-f16c-u16.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vbinary/gen/f16-vsqrdiffc-f16c-u8.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vbinary/gen/f16-vsqrdiffc-f16c-u16.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vbinary/gen/f16-vsub-minmax-f16c-u8.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vbinary/gen/f16-vsub-minmax-f16c-u16.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vbinary/gen/f16-vsubc-minmax-f16c-u8.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vbinary/gen/f16-vsubc-minmax-f16c-u16.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vclamp/gen/f16-vclamp-f16c-u8.c.o [ 48%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/PadNd.cpp.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vclamp/gen/f16-vclamp-f16c-u16.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vhswish/gen/f16-vhswish-f16c-u8.c.o [ 48%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/PixelShuffle.cpp.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vhswish/gen/f16-vhswish-f16c-u16.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vlrelu/gen/f16-vlrelu-f16c-u8.c.o [ 48%] Building C 
object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vlrelu/gen/f16-vlrelu-f16c-u16.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vrnd/gen/f16-vrndd-f16c-u8.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vrnd/gen/f16-vrndd-f16c-u16.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vrnd/gen/f16-vrndne-f16c-u8.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vrnd/gen/f16-vrndne-f16c-u16.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vrnd/gen/f16-vrndu-f16c-u8.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vrnd/gen/f16-vrndu-f16c-u16.c.o [ 48%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/PointwiseOps.cpp.o [ 48%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/Pooling.cpp.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vrnd/gen/f16-vrndz-f16c-u8.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vrnd/gen/f16-vrndz-f16c-u16.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vsqrt/gen/f16-vsqrt-f16c-rsqrt-u8.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vsqrt/gen/f16-vsqrt-f16c-rsqrt-u16.c.o [ 48%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/Pow.cpp.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vsqrt/gen/f16-vsqrt-f16c-rsqrt-u32.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vsqrt/gen/f16-vsqrt-f16c-sqrt-u8.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vsqrt/gen/f16-vsqrt-f16c-sqrt-u16.c.o [ 48%] Building CXX object 
caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/QuantizedLinear.cpp.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vsqrt/gen/f16-vsqrt-f16c-sqrt-u32.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-f16c-expm1minus-rr1-p3h2ts-div-u8.c.o [ 48%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/RNN.cpp.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-f16c-expm1minus-rr1-p3h2ts-div-u16.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-f16c-expm1minus-rr1-p3h2ts-div-u24.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-f16c-expm1minus-rr1-p3h2ts-div-u32.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-f16c-expm1minus-rr1-p3h2ts-div-u40.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-f16c-expm1minus-rr1-p3h2ts-div-u48.c.o [ 48%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/RangeFactories.cpp.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-f16c-expm1minus-rr1-p3h2ts-div-u56.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-f16c-expm1minus-rr1-p3h2ts-div-u64.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-f16c-expm1minus-rr1-p3h2ts-div-u72.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-f16c-expm1minus-rr1-p3h2ts-div-u80.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-f16c-expm1minus-rr1-p3h2ts-rcp-u8.c.o [ 48%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-f16c-expm1minus-rr1-p3h2ts-rcp-u16.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-f16c-expm1minus-rr1-p3h2ts-rcp-u24.c.o [ 48%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/ReduceAllOps.cpp.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-f16c-expm1minus-rr1-p3h2ts-rcp-u32.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-f16c-expm1minus-rr1-p3h2ts-rcp-u40.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-f16c-expm1minus-rr1-p3h2ts-rcp-u48.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-f16c-expm1minus-rr1-p3h2ts-rcp-u56.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-f16c-expm1minus-rr1-p3h2ts-rcp-u64.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-f16c-expm1minus-rr1-p3h2ts-rcp-u72.c.o [ 48%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/ReduceOps.cpp.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-f16c-expm1minus-rr1-p3h2ts-rcp-u80.c.o [ 48%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/ReflectionPad.cpp.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-f16c-polynomial-p19h9t2-u8.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-f16c-polynomial-p19h9t2-u16.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-f16c-polynomial-p19h9t2-u24.c.o [ 48%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-f16c-polynomial-p19h9t2-u32.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-f16c-polynomial-p19h9t2-u40.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-f16c-polynomial-p19h9t2-u48.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-f16c-polynomial-p19h9t2-u56.c.o [ 48%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-f16c-polynomial-p19h9t2-u64.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-f16c-polynomial-p19h9t2-u72.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-f16c-polynomial-p19h9t2-u80.c.o [ 49%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/Repeat.cpp.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vunary/gen/f16-vsqr-f16c-u8.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vunary/gen/f16-vsqr-f16c-u16.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-f16-vcvt/gen/f32-f16-vcvt-f16c-u8.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-f16-vcvt/gen/f32-f16-vcvt-f16c-u16.c.o [ 49%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/ReplicationPadding.cpp.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f16-f32-cvt-f16c.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-f16-cvt-f16c.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f16-tanh-f16c-expm1minus-rr1-p3h2ts-div.c.o [ 49%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f16-tanh-f16c-expm1minus-rr1-p3h2ts-rcp.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f16-tanh-f16c-polynomial-p17h8t2.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f16-tanh-f16c-polynomial-p19h9t2.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-1x4c8-minmax-xop-ld64.c.o [ 49%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/Resize.cpp.o [ 49%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/RowwisePrune.cpp.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-1x4c8-minmax-xop-ld128.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-2x4c8-minmax-xop-ld64.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-2x4c8-minmax-xop-ld128.c.o [ 49%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/Scalar.cpp.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-3x4c8-minmax-xop-ld64.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-3x4c8-minmax-xop-ld128.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-4x4c8-minmax-xop-ld64.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-4x4c8-minmax-xop-ld128.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-1x4c8-minmax-xop-ld64.c.o [ 49%] Building CXX object 
caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/SegmentReduce.cpp.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-1x4c8-minmax-xop-ld128.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-2x4c8-minmax-xop-ld64.c.o [ 49%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/SobolEngineOps.cpp.o [ 49%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/SobolEngineOpsUtils.cpp.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-2x4c8-minmax-xop-ld128.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-3x4c8-minmax-xop-ld64.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-3x4c8-minmax-xop-ld128.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-4x4c8-minmax-xop-ld64.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-4x4c8-minmax-xop-ld128.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-1x4c8-minmax-xop-ld64.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-1x4c8-minmax-xop-ld128.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-2x4c8-minmax-xop-ld64.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-2x4c8-minmax-xop-ld128.c.o [ 49%] Building CXX object 
caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/SoftMax.cpp.o [ 49%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/Sorting.cpp.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-3x4c8-minmax-xop-ld64.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-3x4c8-minmax-xop-ld128.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-4x4c8-minmax-xop-ld64.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-4x4c8-minmax-xop-ld128.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-5f5m5l8c4s4r-minmax-fp32-xop-mul32.c.o [ 49%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/SparseTensorUtils.cpp.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-5f5m5l16c4s4r-minmax-fp32-xop-mul32.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-6f6m7l8c4s4r-minmax-fp32-xop-mul32.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-6f6m7l16c4s4r-minmax-fp32-xop-mul32.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-8f8m9l8c4s4r-minmax-fp32-xop-mul32.c.o [ 49%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/SpectralOps.cpp.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-8f8m9l16c4s4r-minmax-fp32-xop-mul32.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p8c-minmax-fp32-xop-mul16-add16.c.o [ 49%] 
Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p8c-minmax-fp32-xop-mul32.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p16c-minmax-fp32-xop-mul16-add16.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p16c-minmax-fp32-xop-mul32.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p8c-minmax-fp32-xop-mul16-add16.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p8c-minmax-fp32-xop-mul32.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p16c-minmax-fp32-xop-mul16-add16.c.o [ 49%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/SummaryOps.cpp.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p16c-minmax-fp32-xop-mul32.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-3p16c-minmax-fp32-xop-mul16-add16.c.o [ 49%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/TensorAdvancedIndexing.cpp.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-5f5m5l8c4s4r-minmax-fp32-xop-mul32.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-5f5m5l16c4s4r-minmax-fp32-xop-mul32.c.o [ 49%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/TensorCompare.cpp.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-6f6m7l8c4s4r-minmax-fp32-xop-mul32.c.o [ 49%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-6f6m7l16c4s4r-minmax-fp32-xop-mul32.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-8f8m9l8c4s4r-minmax-fp32-xop-mul32.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-8f8m9l16c4s4r-minmax-fp32-xop-mul32.c.o [ 49%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/TensorConversions.cpp.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p8c-minmax-fp32-xop-mul16-add16.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p8c-minmax-fp32-xop-mul32.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p16c-minmax-fp32-xop-mul16-add16.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p16c-minmax-fp32-xop-mul32.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p8c-minmax-fp32-xop-mul16-add16.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p8c-minmax-fp32-xop-mul32.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p16c-minmax-fp32-xop-mul16-add16.c.o [ 49%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/TensorFactories.cpp.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p16c-minmax-fp32-xop-mul32.c.o [ 49%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x4c2-minmax-fp32-xop-ld64.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x4c2-minmax-fp32-xop-ld128.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x4c2s4-minmax-fp32-xop-ld64.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x4c2s4-minmax-fp32-xop-ld128.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x4c8-minmax-fp32-xop-ld64.c.o [ 49%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x4c8-minmax-fp32-xop-ld128.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x4c2-minmax-fp32-xop-ld64.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x4c2-minmax-fp32-xop-ld128.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x4c2s4-minmax-fp32-xop-ld64.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x4c2s4-minmax-fp32-xop-ld128.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x4c8-minmax-fp32-xop-ld64.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x4c8-minmax-fp32-xop-ld128.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x4c2-minmax-fp32-xop-ld64.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x4c2-minmax-fp32-xop-ld128.c.o [ 50%] 
Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x4c2s4-minmax-fp32-xop-ld64.c.o [ 50%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/TensorIteratorReduce.cpp.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x4c2s4-minmax-fp32-xop-ld128.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x4c8-minmax-fp32-xop-ld64.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x4c8-minmax-fp32-xop-ld128.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-4x4c2-minmax-fp32-xop-ld64.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-4x4c2-minmax-fp32-xop-ld128.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-4x4c2s4-minmax-fp32-xop-ld64.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-4x4c2s4-minmax-fp32-xop-ld128.c.o [ 50%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/TensorProperties.cpp.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x4c2-minmax-fp32-xop-ld64.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x4c2-minmax-fp32-xop-ld128.c.o [ 50%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/TensorShape.cpp.o [ 50%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/TensorTransformations.cpp.o [ 50%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x4c2s4-minmax-fp32-xop-ld64.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x4c2s4-minmax-fp32-xop-ld128.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x4c8-minmax-fp32-xop-ld64.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x4c8-minmax-fp32-xop-ld128.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x4c2-minmax-fp32-xop-ld64.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x4c2-minmax-fp32-xop-ld128.c.o [ 50%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/TestOps.cpp.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x4c2s4-minmax-fp32-xop-ld64.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x4c2s4-minmax-fp32-xop-ld128.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x4c8-minmax-fp32-xop-ld64.c.o [ 50%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/TriangularOps.cpp.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x4c8-minmax-fp32-xop-ld128.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x4c2-minmax-fp32-xop-ld64.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x4c2-minmax-fp32-xop-ld128.c.o [ 50%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x4c2s4-minmax-fp32-xop-ld64.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x4c2s4-minmax-fp32-xop-ld128.c.o [ 50%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/TypeProperties.cpp.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x4c8-minmax-fp32-xop-ld64.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x4c8-minmax-fp32-xop-ld128.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-4x4c2-minmax-fp32-xop-ld64.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-4x4c2-minmax-fp32-xop-ld128.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-4x4c2s4-minmax-fp32-xop-ld64.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-4x4c2s4-minmax-fp32-xop-ld128.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vadd/gen/qs8-vadd-minmax-xop-mul32-ld32-u8.c.o [ 50%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/UnaryOps.cpp.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vadd/gen/qs8-vadd-minmax-xop-mul32-ld32-u16.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vadd/gen/qs8-vadd-minmax-xop-mul32-ld32-u24.c.o [ 50%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/Unfold2d.cpp.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vadd/gen/qs8-vadd-minmax-xop-mul32-ld32-u32.c.o [ 50%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vaddc/gen/qs8-vaddc-minmax-xop-mul32-ld32-u8.c.o [ 50%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/Unfold3d.cpp.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vaddc/gen/qs8-vaddc-minmax-xop-mul32-ld32-u16.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vaddc/gen/qs8-vaddc-minmax-xop-mul32-ld32-u24.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vaddc/gen/qs8-vaddc-minmax-xop-mul32-ld32-u32.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-5f5m5l8c4s4r-minmax-fp32-xop-mul32.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-5f5m5l16c4s4r-minmax-fp32-xop-mul32.c.o [ 50%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/UnfoldBackward.cpp.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-6f6m7l8c4s4r-minmax-fp32-xop-mul32.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-6f6m7l16c4s4r-minmax-fp32-xop-mul32.c.o [ 50%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/Unique.cpp.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-8f8m9l8c4s4r-minmax-fp32-xop-mul32.c.o [ 50%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/UpSample.cpp.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-8f8m9l16c4s4r-minmax-fp32-xop-mul32.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-9p8c-minmax-fp32-xop-mul32.c.o [ 50%] Building CXX object 
caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/UpSampleBicubic2d.cpp.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-9p16c-minmax-fp32-xop-mul32.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-25p8c-minmax-fp32-xop-mul32.c.o [ 50%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/UpSampleBilinear2d.cpp.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-25p16c-minmax-fp32-xop-mul32.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-1x4c2-minmax-fp32-xop-ld64.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-1x4c2-minmax-fp32-xop-ld128.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-1x4c2s4-minmax-fp32-xop-ld64.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-1x4c2s4-minmax-fp32-xop-ld128.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-1x4c8-minmax-fp32-xop-ld64.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-1x4c8-minmax-fp32-xop-ld128.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-2x4c2-minmax-fp32-xop-ld64.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-2x4c2-minmax-fp32-xop-ld128.c.o [ 50%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/UpSampleLinear1d.cpp.o [ 50%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/UpSampleNearest1d.cpp.o [ 50%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-2x4c2s4-minmax-fp32-xop-ld64.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-2x4c2s4-minmax-fp32-xop-ld128.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-2x4c8-minmax-fp32-xop-ld64.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-2x4c8-minmax-fp32-xop-ld128.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-3x4c2-minmax-fp32-xop-ld64.c.o [ 50%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-3x4c2-minmax-fp32-xop-ld128.c.o [ 50%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/UpSampleNearest2d.cpp.o [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-3x4c2s4-minmax-fp32-xop-ld64.c.o [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-3x4c2s4-minmax-fp32-xop-ld128.c.o [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-3x4c8-minmax-fp32-xop-ld64.c.o [ 51%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/UpSampleNearest3d.cpp.o [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-3x4c8-minmax-fp32-xop-ld128.c.o [ 51%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/UpSampleTrilinear3d.cpp.o [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-4x4c2-minmax-fp32-xop-ld64.c.o [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-4x4c2-minmax-fp32-xop-ld128.c.o [ 51%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-4x4c2s4-minmax-fp32-xop-ld64.c.o [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-4x4c2s4-minmax-fp32-xop-ld128.c.o [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-1x4c2-minmax-fp32-xop-ld64.c.o [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-1x4c2-minmax-fp32-xop-ld128.c.o [ 51%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/VariableMethodStubs.cpp.o [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-1x4c2s4-minmax-fp32-xop-ld64.c.o [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-1x4c2s4-minmax-fp32-xop-ld128.c.o [ 51%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/WeightNorm.cpp.o [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-1x4c8-minmax-fp32-xop-ld64.c.o [ 51%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/group_norm.cpp.o [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-1x4c8-minmax-fp32-xop-ld128.c.o [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-2x4c2-minmax-fp32-xop-ld64.c.o [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-2x4c2-minmax-fp32-xop-ld128.c.o [ 51%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/layer_norm.cpp.o [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-2x4c2s4-minmax-fp32-xop-ld64.c.o [ 51%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-2x4c2s4-minmax-fp32-xop-ld128.c.o [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-2x4c8-minmax-fp32-xop-ld64.c.o [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-2x4c8-minmax-fp32-xop-ld128.c.o [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-3x4c2-minmax-fp32-xop-ld64.c.o [ 51%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/prim_native_functions.cpp.o [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-3x4c2-minmax-fp32-xop-ld128.c.o [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-3x4c2s4-minmax-fp32-xop-ld64.c.o [ 51%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/verbose_wrapper.cpp.o [ 51%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/ao_sparse/library.cpp.o [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-3x4c2s4-minmax-fp32-xop-ld128.c.o [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-3x4c8-minmax-fp32-xop-ld64.c.o [ 51%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/ao_sparse/quantized/cpu/fbgemm_utils.cpp.o [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-3x4c8-minmax-fp32-xop-ld128.c.o [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-4x4c2-minmax-fp32-xop-ld64.c.o [ 51%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/ao_sparse/quantized/cpu/qlinear.cpp.o [ 51%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-4x4c2-minmax-fp32-xop-ld128.c.o [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-4x4c2s4-minmax-fp32-xop-ld64.c.o [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-4x4c2s4-minmax-fp32-xop-ld128.c.o [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vadd/gen/qu8-vadd-minmax-xop-mul32-ld32-u8.c.o [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vadd/gen/qu8-vadd-minmax-xop-mul32-ld32-u16.c.o [ 51%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/ao_sparse/quantized/cpu/qlinear_deserialize.cpp.o [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vaddc/gen/qu8-vaddc-minmax-xop-mul32-ld32-u8.c.o [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vaddc/gen/qu8-vaddc-minmax-xop-mul32-ld32-u16.c.o [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-3p8c-minmax-fma3-acc2.c.o [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-3p8c-minmax-fma3.c.o [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-3p16c-minmax-fma3-acc2.c.o [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-3p16c-minmax-fma3.c.o [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-3p32c-minmax-fma3-acc2.c.o [ 52%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/ao_sparse/quantized/cpu/qlinear_dynamic.cpp.o [ 52%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-3p32c-minmax-fma3.c.o [ 52%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-4p8c-minmax-fma3-acc2.c.o [ 52%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/ao_sparse/quantized/cpu/qlinear_prepack.cpp.o [ 52%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-4p8c-minmax-fma3.c.o [ 52%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-4p16c-minmax-fma3-acc2.c.o [ 52%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-4p16c-minmax-fma3.c.o [ 52%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-4p32c-minmax-fma3-acc2.c.o [ 52%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-4p32c-minmax-fma3.c.o [ 52%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-5f5m5l8c8s4r-minmax-fma3-acc2.c.o [ 52%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-5f5m5l8c8s4r-minmax-fma3.c.o [ 52%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/ao_sparse/quantized/cpu/qlinear_serialize.cpp.o [ 52%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-5f5m5l16c8s4r-minmax-fma3-acc2.c.o [ 52%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-5f5m5l16c8s4r-minmax-fma3.c.o [ 52%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/ao_sparse/quantized/cpu/qlinear_unpack.cpp.o [ 52%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-5f5m5l32c8s4r-minmax-fma3-acc2.c.o [ 52%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-5f5m5l32c8s4r-minmax-fma3.c.o [ 52%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-6f6m7l8c8s4r-minmax-fma3-acc2.c.o [ 52%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-6f6m7l8c8s4r-minmax-fma3.c.o [ 52%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/sparse/FlattenIndicesKernel.cpp.o [ 52%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-6f6m7l16c8s4r-minmax-fma3-acc2.c.o [ 52%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-6f6m7l16c8s4r-minmax-fma3.c.o [ 52%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-6f6m7l32c8s4r-minmax-fma3-acc2.c.o [ 52%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-6f6m7l32c8s4r-minmax-fma3.c.o [ 52%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-8f8m9l8c8s4r-minmax-fma3-acc2.c.o [ 52%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/sparse/ParamUtils.cpp.o [ 52%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-8f8m9l8c8s4r-minmax-fma3.c.o [ 52%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-8f8m9l16c8s4r-minmax-fma3-acc2.c.o [ 52%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/sparse/SoftMax.cpp.o [ 52%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-8f8m9l16c8s4r-minmax-fma3.c.o [ 52%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-8f8m9l32c8s4r-minmax-fma3-acc2.c.o [ 52%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-8f8m9l32c8s4r-minmax-fma3.c.o [ 52%] Building CXX object 
caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/sparse/SparseBinaryOpIntersectionKernel.cpp.o [ 52%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-9p8c-minmax-fma3-acc2.c.o [ 52%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/sparse/SparseBlas.cpp.o [ 52%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-9p8c-minmax-fma3.c.o [ 52%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-9p16c-minmax-fma3-acc2.c.o [ 52%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-9p16c-minmax-fma3.c.o [ 52%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-9p32c-minmax-fma3-acc2.c.o [ 52%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-9p32c-minmax-fma3.c.o [ 52%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-25p8c-minmax-fma3-acc2.c.o [ 52%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/sparse/SparseBlasImpl.cpp.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-25p8c-minmax-fma3.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-25p16c-minmax-fma3-acc2.c.o [ 53%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/sparse/SparseCsrTensor.cpp.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-25p16c-minmax-fma3.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-25p32c-minmax-fma3-acc2.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-dwconv/gen/f16-dwconv-25p32c-minmax-fma3.c.o 
[ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-ibilinear/gen/f16-ibilinear-fma3-c8.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-ibilinear/gen/f16-ibilinear-fma3-c16.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vmulcaddc/gen/f16-vmulcaddc-c8-minmax-fma3-2x.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vmulcaddc/gen/f16-vmulcaddc-c16-minmax-fma3-2x.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-fma3-expm1minus-rr1-p3h2ts-div-u8.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-fma3-expm1minus-rr1-p3h2ts-div-u16.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-fma3-expm1minus-rr1-p3h2ts-div-u24.c.o [ 53%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/sparse/SparseCsrTensorMath.cpp.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-fma3-expm1minus-rr1-p3h2ts-div-u32.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-fma3-expm1minus-rr1-p3h2ts-div-u40.c.o [ 53%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/sparse/SparseFactories.cpp.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-fma3-expm1minus-rr1-p3h2ts-div-u48.c.o [ 53%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/sparse/SparseMatMul.cpp.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-fma3-expm1minus-rr1-p3h2ts-div-u56.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-fma3-expm1minus-rr1-p3h2ts-div-u64.c.o [ 
53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-fma3-expm1minus-rr1-p3h2ts-div-u72.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-fma3-expm1minus-rr1-p3h2ts-div-u80.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-fma3-expm1minus-rr1-p3h2ts-rcp-u8.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-fma3-expm1minus-rr1-p3h2ts-rcp-u16.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-fma3-expm1minus-rr1-p3h2ts-rcp-u24.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-fma3-expm1minus-rr1-p3h2ts-rcp-u32.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-fma3-expm1minus-rr1-p3h2ts-rcp-u40.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-fma3-expm1minus-rr1-p3h2ts-rcp-u48.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-fma3-expm1minus-rr1-p3h2ts-rcp-u56.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-fma3-expm1minus-rr1-p3h2ts-rcp-u64.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-fma3-expm1minus-rr1-p3h2ts-rcp-u72.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-fma3-expm1minus-rr1-p3h2ts-rcp-u80.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-fma3-polynomial-p19h9t2-u8.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-fma3-polynomial-p19h9t2-u16.c.o [ 53%] Building 
C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-fma3-polynomial-p19h9t2-u24.c.o [ 53%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/sparse/SparseTensor.cpp.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-fma3-polynomial-p19h9t2-u32.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-fma3-polynomial-p19h9t2-u40.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-fma3-polynomial-p19h9t2-u48.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-fma3-polynomial-p19h9t2-u56.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-fma3-polynomial-p19h9t2-u64.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-fma3-polynomial-p19h9t2-u72.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-fma3-polynomial-p19h9t2-u80.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-3p8c-minmax-fma3-acc2.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-3p8c-minmax-fma3.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-3p16c-minmax-fma3-acc2.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-3p16c-minmax-fma3.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-4p8c-minmax-fma3-acc2.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-4p8c-minmax-fma3.c.o [ 53%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-4p16c-minmax-fma3-acc2.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-4p16c-minmax-fma3.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-5f5m5l8c8s4r-minmax-fma3-acc2.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-5f5m5l8c8s4r-minmax-fma3.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-5f5m5l16c8s4r-minmax-fma3-acc2.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-5f5m5l16c8s4r-minmax-fma3.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-5f5m5l32c8s4r-minmax-fma3-acc2.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-5f5m5l32c8s4r-minmax-fma3.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-7f6m6l8c8s4r-minmax-fma3-acc2.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-7f6m6l8c8s4r-minmax-fma3.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-7f6m6l16c8s4r-minmax-fma3-acc2.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-7f6m6l16c8s4r-minmax-fma3.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-7f6m6l32c8s4r-minmax-fma3-acc2.c.o [ 53%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/sparse/SparseTensorMath.cpp.o [ 53%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-7f6m6l32c8s4r-minmax-fma3.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-9p8c-minmax-fma3-acc2.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-9p8c-minmax-fma3.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-9p16c-minmax-fma3-acc2.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-9p16c-minmax-fma3.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-25p8c-minmax-fma3-acc2.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-25p8c-minmax-fma3.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-25p16c-minmax-fma3-acc2.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-25p16c-minmax-fma3.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-1x8-minmax-fma3-broadcast.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-1x16-minmax-fma3-broadcast.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-1x16s4-minmax-fma3-broadcast.c.o [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-3x16-minmax-fma3-broadcast.c.o [ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-3x16s4-minmax-fma3-broadcast.c.o [ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-4x8-minmax-fma3-broadcast.c.o [ 54%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-4x16-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-4x16s4-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-5x8-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-5x16-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-5x16s4-minmax-fma3-broadcast.c.o
[ 54%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/sparse/SparseUnaryOps.cpp.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-6x8-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-6x16-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-6x16s4-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-7x8-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-8x8-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-1x8-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-1x16-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-1x16s4-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-3x16-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-3x16s4-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-4x8-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-4x16-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-4x16s4-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-5x8-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-5x16-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-5x16s4-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-6x8-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-6x16-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-6x16s4-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-7x8-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-8x8-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-1x8-minmax-fma3-broadcast.c.o
[ 54%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/sparse/ValidateCompressedIndicesKernel.cpp.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-1x16-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-1x16s4-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-3x16-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-3x16s4-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-4x8-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-4x16-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-4x16s4-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-5x8-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-5x16-minmax-fma3-broadcast-prfm.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-5x16-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-5x16s4-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-6x8-minmax-fma3-broadcast.c.o
[ 54%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/nested/NestedTensorAliases.cpp.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-6x16-minmax-fma3-broadcast-prfm.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-6x16-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-6x16s4-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-7x8-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-8x8-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc4w-gemm/gen/f32-qc4w-gemm-1x16-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc4w-gemm/gen/f32-qc4w-gemm-2x16-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc4w-gemm/gen/f32-qc4w-gemm-3x16-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc4w-gemm/gen/f32-qc4w-gemm-4x16-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc4w-gemm/gen/f32-qc4w-gemm-5x16-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc4w-gemm/gen/f32-qc4w-gemm-6x16-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc4w-gemm/gen/f32-qc4w-gemm-7x16-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc4w-gemm/gen/f32-qc4w-gemm-8x16-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-1x16-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-2x16-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-3x16-minmax-fma3-broadcast.c.o
[ 54%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/nested/NestedTensorBackward.cpp.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-5x16-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-4x16-minmax-fma3-broadcast.c.o
[ 54%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/nested/NestedTensorBinaryOps.cpp.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-6x16-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-7x16-minmax-fma3-broadcast.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-8x16-minmax-fma3-broadcast.c.o
[ 54%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/nested/NestedTensorFactories.cpp.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vhswish/gen/f32-vhswish-fma3-u8.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vhswish/gen/f32-vhswish-fma3-u16.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrsqrt/gen/f32-vrsqrt-fma3-rsqrt-u8.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrsqrt/gen/f32-vrsqrt-fma3-rsqrt-u16.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrsqrt/gen/f32-vrsqrt-fma3-rsqrt-u32.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsqrt/gen/f32-vsqrt-fma3-nr1fma1adj-u8.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsqrt/gen/f32-vsqrt-fma3-nr1fma1adj-u16.c.o
[ 54%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsqrt/gen/f32-vsqrt-fma3-nr1fma1adj-u32.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-lut4-p4h3ts-perm-div-u8.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-lut4-p4h3ts-perm-div-u16.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-lut4-p4h3ts-perm-div-u24.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-lut4-p4h3ts-perm-div-u32.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-lut4-p4h3ts-perm-div-u40.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-lut4-p4h3ts-perm-div-u48.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-lut4-p4h3ts-perm-div-u56.c.o
[ 55%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/nested/NestedTensorMath.cpp.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-lut4-p4h3ts-perm-div-u64.c.o
[ 55%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/nested/NestedTensorMatmul.cpp.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-lut4-p4h3ts-perm-div-u72.c.o
[ 55%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/nested/NestedTensorTransformerFunctions.cpp.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-lut4-p4h3ts-perm-div-u80.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-lut4-p4h3ts-perm-nr1adj-u8.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-lut4-p4h3ts-perm-nr1adj-u16.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-lut4-p4h3ts-perm-nr1adj-u24.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-lut4-p4h3ts-perm-nr1adj-u32.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-lut4-p4h3ts-perm-nr1adj-u40.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-lut4-p4h3ts-perm-nr1adj-u48.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-lut4-p4h3ts-perm-nr1adj-u56.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-lut4-p4h3ts-perm-nr1adj-u64.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-lut4-p4h3ts-perm-nr1adj-u72.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-lut4-p4h3ts-perm-nr1adj-u80.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-lut8-p4h3ts-div-u8.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-lut8-p4h3ts-div-u16.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-lut8-p4h3ts-div-u24.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-lut8-p4h3ts-div-u32.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-lut8-p4h3ts-nr1adj-u8.c.o
[ 55%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/nested/NestedTensorUnaryOps.cpp.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-lut8-p4h3ts-nr1adj-u16.c.o
[ 55%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/nested/NestedTensorUtils.cpp.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-lut8-p4h3ts-nr1adj-u24.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-lut8-p4h3ts-nr1adj-u32.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-p6h5ts-div-u8.c.o
[ 55%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/AffineQuantizer.cpp.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-p6h5ts-div-u16.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-p6h5ts-div-u24.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-p6h5ts-div-u32.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-p6h5ts-div-u40.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-p6h5ts-div-u48.c.o
[ 55%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/AffineQuantizerBase.cpp.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-p6h5ts-div-u56.c.o
[ 55%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/Copy.cpp.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-p6h5ts-div-u64.c.o
[ 55%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/FakeQuantPerChannelAffine.cpp.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-p6h5ts-div-u72.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-p6h5ts-div-u80.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-p6h5ts-nr1-u8.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-p6h5ts-nr1-u16.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-p6h5ts-nr1-u24.c.o
[ 55%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/FakeQuantPerTensorAffine.cpp.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-p6h5ts-nr1-u32.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-p6h5ts-nr1-u40.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-p6h5ts-nr1-u48.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-p6h5ts-nr1-u56.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-p6h5ts-nr1-u64.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-p6h5ts-nr1-u72.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-p6h5ts-nr1-u80.c.o
[ 55%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/QTensor.cpp.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-p6h5ts-nr1adj-u8.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-p6h5ts-nr1adj-u16.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-p6h5ts-nr1adj-u24.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-p6h5ts-nr1adj-u32.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-p6h5ts-nr1adj-u40.c.o
[ 55%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/TensorAdvancedIndexing.cpp.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-p6h5ts-nr1adj-u48.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-p6h5ts-nr1adj-u56.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-p6h5ts-nr1adj-u64.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-p6h5ts-nr1adj-u72.c.o
[ 55%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/TensorCompare.cpp.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-fma3-expm1minus-rr1-p6h5ts-nr1adj-u80.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sqrt-fma3-nr1fma1adj.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sqrt-fma3-nr1fma.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sqrt-fma3-nr2fma.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f16-tanh-fma3-expm1minus-rr1-p3h2ts-div.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f16-tanh-fma3-expm1minus-rr1-p3h2ts-rcp.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f16-tanh-fma3-polynomial-p17h8t2.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f16-tanh-fma3-polynomial-p19h9t2.c.o
[ 55%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/TensorFactories.cpp.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma3-expm1minus-rr1-lut4-p4h3ts-perm-div.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma3-expm1minus-rr1-lut4-p4h3ts-perm-nr1adj.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma3-expm1minus-rr1-lut8-p4h3ps-div.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma3-expm1minus-rr1-lut8-p4h3ps-nr1.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma3-expm1minus-rr1-lut8-p4h3ps-nr1adj.c.o
[ 55%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma3-expm1minus-rr1-p6h5ts-div.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma3-expm1minus-rr1-p6h5ts-nr1.c.o
[ 56%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/AdaptiveAveragePooling.cpp.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-fma3-expm1minus-rr1-p6h5ts-nr1adj.c.o
[ 56%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/AveragePool2d.cpp.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32acc-gemm/gen/f16-f32acc-gemm-1x8-minmax-avx2-broadcast.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32acc-gemm/gen/f16-f32acc-gemm-1x16-minmax-avx2-broadcast.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32acc-gemm/gen/f16-f32acc-gemm-3x16-minmax-avx2-broadcast.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32acc-gemm/gen/f16-f32acc-gemm-4x8-minmax-avx2-broadcast.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32acc-gemm/gen/f16-f32acc-gemm-4x16-minmax-avx2-broadcast.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32acc-gemm/gen/f16-f32acc-gemm-5x8-minmax-avx2-broadcast.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32acc-gemm/gen/f16-f32acc-gemm-5x16-minmax-avx2-broadcast.c.o
[ 56%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/AveragePool3d.cpp.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32acc-gemm/gen/f16-f32acc-gemm-6x8-minmax-avx2-broadcast.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32acc-gemm/gen/f16-f32acc-gemm-7x8-minmax-avx2-broadcast.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32acc-igemm/gen/f16-f32acc-igemm-1x8-minmax-avx2-broadcast.c.o
[ 56%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/BinaryOps.cpp.o
[ 56%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/ChannelShuffle.cpp.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32acc-igemm/gen/f16-f32acc-igemm-1x16-minmax-avx2-broadcast.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32acc-igemm/gen/f16-f32acc-igemm-3x16-minmax-avx2-broadcast.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32acc-igemm/gen/f16-f32acc-igemm-4x8-minmax-avx2-broadcast.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32acc-igemm/gen/f16-f32acc-igemm-4x16-minmax-avx2-broadcast.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32acc-igemm/gen/f16-f32acc-igemm-5x8-minmax-avx2-broadcast.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32acc-igemm/gen/f16-f32acc-igemm-5x16-minmax-avx2-broadcast.c.o
[ 56%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/IntReprQuant.cpp.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32acc-igemm/gen/f16-f32acc-igemm-6x8-minmax-avx2-broadcast.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32acc-igemm/gen/f16-f32acc-igemm-7x8-minmax-avx2-broadcast.c.o
[ 56%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/LinearUnpackImpl.cpp.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-gemm/gen/f16-gemm-1x8-minmax-avx2-broadcast.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-gemm/gen/f16-gemm-1x16-minmax-avx2-broadcast.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-gemm/gen/f16-gemm-3x16-minmax-avx2-broadcast.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-gemm/gen/f16-gemm-4x8-minmax-avx2-broadcast.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-gemm/gen/f16-gemm-4x16-minmax-avx2-broadcast.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-gemm/gen/f16-gemm-5x8-minmax-avx2-broadcast.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-gemm/gen/f16-gemm-5x16-minmax-avx2-broadcast.c.o
[ 56%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/MakePerTensorQuantizedTensor.cpp.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-gemm/gen/f16-gemm-6x8-minmax-avx2-broadcast.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-gemm/gen/f16-gemm-7x8-minmax-avx2-broadcast.c.o
[ 56%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/Normalization.cpp.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-igemm/gen/f16-igemm-1x8-minmax-avx2-broadcast.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-igemm/gen/f16-igemm-1x16-minmax-avx2-broadcast.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-igemm/gen/f16-igemm-3x16-minmax-avx2-broadcast.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-igemm/gen/f16-igemm-4x8-minmax-avx2-broadcast.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-igemm/gen/f16-igemm-4x16-minmax-avx2-broadcast.c.o
[ 56%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/Pooling.cpp.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-igemm/gen/f16-igemm-5x8-minmax-avx2-broadcast.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-igemm/gen/f16-igemm-5x16-minmax-avx2-broadcast.c.o
[ 56%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/ReduceOps.cpp.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-igemm/gen/f16-igemm-6x8-minmax-avx2-broadcast.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-igemm/gen/f16-igemm-7x8-minmax-avx2-broadcast.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-pavgpool/f16-pavgpool-9p8x-minmax-avx2-c8.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-pavgpool/f16-pavgpool-9x-minmax-avx2-c8.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-raddstoreexpminusmax/gen/f16-raddstoreexpminusmax-avx2-rr1-p2-u32-acc2.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-raddstoreexpminusmax/gen/f16-raddstoreexpminusmax-avx2-rr1-p2-u32-acc4.c.o
[ 56%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/RuyUtils.cpp.o
[ 56%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/Sorting.cpp.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-raddstoreexpminusmax/gen/f16-raddstoreexpminusmax-avx2-rr1-p2-u32.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-raddstoreexpminusmax/gen/f16-raddstoreexpminusmax-avx2-rr1-p2-u40-acc2.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-raddstoreexpminusmax/gen/f16-raddstoreexpminusmax-avx2-rr1-p2-u40-acc5.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-raddstoreexpminusmax/gen/f16-raddstoreexpminusmax-avx2-rr1-p2-u40.c.o
[ 56%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/TensorOperators.cpp.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-raddstoreexpminusmax/gen/f16-raddstoreexpminusmax-avx2-rr1-p2-u48-acc2.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-raddstoreexpminusmax/gen/f16-raddstoreexpminusmax-avx2-rr1-p2-u48-acc3.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-raddstoreexpminusmax/gen/f16-raddstoreexpminusmax-avx2-rr1-p2-u48.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-raddstoreexpminusmax/gen/f16-raddstoreexpminusmax-avx2-rr1-p2-u64-acc2.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-raddstoreexpminusmax/gen/f16-raddstoreexpminusmax-avx2-rr1-p2-u64-acc4.c.o
[ 56%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/TensorShape.cpp.o
[ 56%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/UpSampleBilinear2d.cpp.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-raddstoreexpminusmax/gen/f16-raddstoreexpminusmax-avx2-rr1-p2-u64.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-raddstoreexpminusmax/gen/f16-raddstoreexpminusmax-avx2-rr1-p2-u72-acc3.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-raddstoreexpminusmax/gen/f16-raddstoreexpminusmax-avx2-rr1-p2-u72.c.o
[ 56%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/UpSampleNearest2d.cpp.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-raddstoreexpminusmax/gen/f16-raddstoreexpminusmax-avx2-rr1-p2-u80-acc2.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-raddstoreexpminusmax/gen/f16-raddstoreexpminusmax-avx2-rr1-p2-u80-acc5.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-raddstoreexpminusmax/gen/f16-raddstoreexpminusmax-avx2-rr1-p2-u80.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-raddstoreexpminusmax/gen/f16-raddstoreexpminusmax-avx2-rr1-p2-u96-acc2.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-raddstoreexpminusmax/gen/f16-raddstoreexpminusmax-avx2-rr1-p2-u96-acc3.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-raddstoreexpminusmax/gen/f16-raddstoreexpminusmax-avx2-rr1-p2-u96-acc6.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-raddstoreexpminusmax/gen/f16-raddstoreexpminusmax-avx2-rr1-p2-u96.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-velu/gen/f16-velu-avx2-rr1-p3-u8.c.o
[ 56%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/UpSampleNearest3d.cpp.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-velu/gen/f16-velu-avx2-rr1-p3-u16.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vsigmoid/gen/f16-vsigmoid-avx2-rr1-p2-div-u8.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vsigmoid/gen/f16-vsigmoid-avx2-rr1-p2-div-u16.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vsigmoid/gen/f16-vsigmoid-avx2-rr1-p2-div-u24.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vsigmoid/gen/f16-vsigmoid-avx2-rr1-p2-div-u32.c.o
[ 56%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/XnnpackUtils.cpp.o
[ 56%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/fbgemm_utils.cpp.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vsigmoid/gen/f16-vsigmoid-avx2-rr1-p2-div-u40.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vsigmoid/gen/f16-vsigmoid-avx2-rr1-p2-div-u48.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vsigmoid/gen/f16-vsigmoid-avx2-rr1-p2-div-u56.c.o
[ 56%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vsigmoid/gen/f16-vsigmoid-avx2-rr1-p2-div-u64.c.o
[ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vsigmoid/gen/f16-vsigmoid-avx2-rr1-p2-rcp-u8.c.o
[ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vsigmoid/gen/f16-vsigmoid-avx2-rr1-p2-rcp-u16.c.o
[ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vsigmoid/gen/f16-vsigmoid-avx2-rr1-p2-rcp-u24.c.o
[ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vsigmoid/gen/f16-vsigmoid-avx2-rr1-p2-rcp-u32.c.o
[ 57%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/fused_obs_fake_quant.cpp.o
[ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vsigmoid/gen/f16-vsigmoid-avx2-rr1-p2-rcp-u40.c.o
[ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vsigmoid/gen/f16-vsigmoid-avx2-rr1-p2-rcp-u48.c.o
[ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vsigmoid/gen/f16-vsigmoid-avx2-rr1-p2-rcp-u56.c.o
[ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vsigmoid/gen/f16-vsigmoid-avx2-rr1-p2-rcp-u64.c.o
[ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-avx2-expm1minus-rr1-p3h2ts-div-u8.c.o
[ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-avx2-expm1minus-rr1-p3h2ts-div-u16.c.o
[ 57%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/init_qnnpack.cpp.o
[ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-avx2-expm1minus-rr1-p3h2ts-div-u24.c.o
[ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-avx2-expm1minus-rr1-p3h2ts-div-u32.c.o
[ 57%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/qclamp.cpp.o
[ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-avx2-expm1minus-rr1-p3h2ts-div-u40.c.o
[ 57%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/qconv.cpp.o
[ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-avx2-expm1minus-rr1-p3h2ts-div-u48.c.o
[ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-avx2-expm1minus-rr1-p3h2ts-div-u56.c.o
[ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-avx2-expm1minus-rr1-p3h2ts-div-u64.c.o
[ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-avx2-expm1minus-rr1-p3h2ts-div-u72.c.o
[ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-avx2-expm1minus-rr1-p3h2ts-div-u80.c.o
[ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-avx2-expm1minus-rr1-p3h2ts-rcp-u8.c.o
[ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-avx2-expm1minus-rr1-p3h2ts-rcp-u16.c.o
[ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-avx2-expm1minus-rr1-p3h2ts-rcp-u24.c.o
[ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-avx2-expm1minus-rr1-p3h2ts-rcp-u32.c.o
[ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-avx2-expm1minus-rr1-p3h2ts-rcp-u40.c.o
[ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-avx2-expm1minus-rr1-p3h2ts-rcp-u48.c.o
[ 57%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/qconv_dynamic.cpp.o
[ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-avx2-expm1minus-rr1-p3h2ts-rcp-u56.c.o
[ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-avx2-expm1minus-rr1-p3h2ts-rcp-u64.c.o
[ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-avx2-expm1minus-rr1-p3h2ts-rcp-u72.c.o
[ 57%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/qconv_prepack.cpp.o
[ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-vtanh/gen/f16-vtanh-avx2-expm1minus-rr1-p3h2ts-rcp-u80.c.o
[ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc4w-gemm/gen/f32-qc4w-gemm-1x16-minmax-avx2-broadcast.c.o
[ 57%] Building C object
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc4w-gemm/gen/f32-qc4w-gemm-2x16-minmax-avx2-broadcast.c.o [ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc4w-gemm/gen/f32-qc4w-gemm-3x16-minmax-avx2-broadcast.c.o [ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc4w-gemm/gen/f32-qc4w-gemm-4x16-minmax-avx2-broadcast.c.o [ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc4w-gemm/gen/f32-qc4w-gemm-5x16-minmax-avx2-broadcast.c.o [ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc4w-gemm/gen/f32-qc4w-gemm-6x16-minmax-avx2-broadcast.c.o [ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc4w-gemm/gen/f32-qc4w-gemm-7x16-minmax-avx2-broadcast.c.o [ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc4w-gemm/gen/f32-qc4w-gemm-8x16-minmax-avx2-broadcast.c.o [ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-1x8-minmax-avx2-broadcast.c.o [ 57%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/qconv_unpack_impl.cpp.o [ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-1x16-minmax-avx2-broadcast.c.o [ 57%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/qdropout.cpp.o [ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-1x16s4-minmax-avx2-broadcast.c.o [ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-2x16-minmax-avx2-broadcast.c.o [ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-2x16s4-minmax-avx2-broadcast.c.o [ 57%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-3x16-minmax-avx2-broadcast.c.o [ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-3x16s4-minmax-avx2-broadcast.c.o [ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-4x8-minmax-avx2-broadcast.c.o [ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-4x16-minmax-avx2-broadcast.c.o [ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-4x16s4-minmax-avx2-broadcast.c.o [ 57%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/qelu.cpp.o [ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-5x8-minmax-avx2-broadcast.c.o [ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-5x16-minmax-avx2-broadcast.c.o [ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-5x16s4-minmax-avx2-broadcast.c.o [ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-6x8-minmax-avx2-broadcast.c.o [ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-6x16-minmax-avx2-broadcast.c.o [ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-6x16s4-minmax-avx2-broadcast.c.o [ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-7x8-minmax-avx2-broadcast.c.o [ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-7x16-minmax-avx2-broadcast.c.o [ 57%] Building CXX object 
caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/qembeddingbag.cpp.o [ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-8x8-minmax-avx2-broadcast.c.o [ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-8x16-minmax-avx2-broadcast.c.o [ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qs8-vcvt/gen/f32-qs8-vcvt-avx2-u16.c.o [ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qs8-vcvt/gen/f32-qs8-vcvt-avx2-u32.c.o [ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qs8-vcvt/gen/f32-qs8-vcvt-avx2-u48.c.o [ 57%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/qembeddingbag_prepack.cpp.o [ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qs8-vcvt/gen/f32-qs8-vcvt-avx2-u64.c.o [ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qu8-vcvt/gen/f32-qu8-vcvt-avx2-u16.c.o [ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qu8-vcvt/gen/f32-qu8-vcvt-avx2-u32.c.o [ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qu8-vcvt/gen/f32-qu8-vcvt-avx2-u48.c.o [ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qu8-vcvt/gen/f32-qu8-vcvt-avx2-u64.c.o [ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddexpminusmax/gen/f32-raddexpminusmax-avx2-p5-u64-acc2.c.o [ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddexpminusmax/gen/f32-raddexpminusmax-avx2-p5-u64-acc4.c.o [ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddexpminusmax/gen/f32-raddexpminusmax-avx2-p5-u64.c.o [ 57%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddexpminusmax/gen/f32-raddexpminusmax-avx2-p5-u72-acc3.c.o [ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddexpminusmax/gen/f32-raddexpminusmax-avx2-p5-u72.c.o [ 57%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddexpminusmax/gen/f32-raddexpminusmax-avx2-p5-u80-acc2.c.o [ 58%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddexpminusmax/gen/f32-raddexpminusmax-avx2-p5-u80-acc5.c.o [ 58%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddexpminusmax/gen/f32-raddexpminusmax-avx2-p5-u80.c.o [ 58%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddexpminusmax/gen/f32-raddexpminusmax-avx2-p5-u96-acc2.c.o [ 58%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddexpminusmax/gen/f32-raddexpminusmax-avx2-p5-u96-acc3.c.o [ 58%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddexpminusmax/gen/f32-raddexpminusmax-avx2-p5-u96-acc6.c.o [ 58%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddexpminusmax/gen/f32-raddexpminusmax-avx2-p5-u96.c.o [ 58%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddextexp/gen/f32-raddextexp-avx2-p5-u64-acc2.c.o [ 58%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddextexp/gen/f32-raddextexp-avx2-p5-u64-acc4.c.o [ 58%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddextexp/gen/f32-raddextexp-avx2-p5-u64.c.o [ 58%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddextexp/gen/f32-raddextexp-avx2-p5-u72-acc3.c.o [ 58%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddextexp/gen/f32-raddextexp-avx2-p5-u72.c.o [ 58%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddextexp/gen/f32-raddextexp-avx2-p5-u80-acc2.c.o [ 58%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddextexp/gen/f32-raddextexp-avx2-p5-u80-acc5.c.o [ 58%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddextexp/gen/f32-raddextexp-avx2-p5-u80.c.o [ 58%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/qembeddingbag_unpack.cpp.o [ 58%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddextexp/gen/f32-raddextexp-avx2-p5-u96-acc2.c.o [ 58%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddextexp/gen/f32-raddextexp-avx2-p5-u96-acc3.c.o [ 58%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/qgelu.cpp.o [ 58%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddextexp/gen/f32-raddextexp-avx2-p5-u96-acc6.c.o [ 58%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddextexp/gen/f32-raddextexp-avx2-p5-u96.c.o [ 58%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/qhardsigmoid.cpp.o [ 58%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-avx2-rr1-p5-u64-acc2.c.o [ 58%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-avx2-rr1-p5-u64-acc4.c.o [ 58%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-avx2-rr1-p5-u64.c.o [ 58%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-avx2-rr1-p5-u72-acc3.c.o [ 58%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-avx2-rr1-p5-u72.c.o [ 58%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/qhardswish.cpp.o [ 58%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-avx2-rr1-p5-u80-acc2.c.o [ 58%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-avx2-rr1-p5-u80-acc5.c.o [ 58%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-avx2-rr1-p5-u80.c.o [ 58%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-avx2-rr1-p5-u96-acc2.c.o [ 59%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/qlinear.cpp.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-avx2-rr1-p5-u96-acc3.c.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-avx2-rr1-p5-u96-acc6.c.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-avx2-rr1-p5-u96.c.o [ 59%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/qlinear_dynamic.cpp.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-lut4-p4-perm-u8.c.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-lut4-p4-perm-u16.c.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-lut4-p4-perm-u24.c.o [ 59%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-lut4-p4-perm-u32.c.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-lut4-p4-perm-u40.c.o [ 59%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/qlinear_prepack.cpp.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-lut4-p4-perm-u48.c.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-lut4-p4-perm-u56.c.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-lut4-p4-perm-u64.c.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-lut4-p4-perm-u72.c.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-lut4-p4-perm-u80.c.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-lut8-p4-perm-u8.c.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-lut8-p4-perm-u16.c.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-lut8-p4-perm-u24.c.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-lut8-p4-perm-u32.c.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-lut8-p4-perm-u40.c.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-lut8-p4-perm-u48.c.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-lut8-p4-perm-u56.c.o [ 59%] Building CXX object 
caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/qmatmul.cpp.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-lut8-p4-perm-u64.c.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-lut8-p4-perm-u72.c.o [ 59%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/qmul.cpp.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-lut8-p4-perm-u80.c.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-lut16-p3-gather-u8.c.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-lut16-p3-gather-u16.c.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-lut16-p3-gather-u24.c.o [ 59%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/qnormalization.cpp.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-lut16-p3-gather-u32.c.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-lut16-p3-gather-u40.c.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-lut16-p3-gather-u48.c.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-lut16-p3-gather-u56.c.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-lut16-p3-gather-u64.c.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-lut16-p3-gather-u72.c.o [ 59%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-lut16-p3-gather-u80.c.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-p6-u8.c.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-p6-u16.c.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-p6-u24.c.o [ 59%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/qrelu.cpp.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-p6-u32.c.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-p6-u40.c.o [ 59%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/qsigmoid.cpp.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-p6-u48.c.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-p6-u56.c.o [ 59%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/qsoftmax.cpp.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-p6-u64.c.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-p6-u72.c.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx2-rr1-p6-u80.c.o [ 59%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleexpminusmax/gen/f32-vscaleexpminusmax-avx2-p5-u8.c.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleexpminusmax/gen/f32-vscaleexpminusmax-avx2-p5-u16.c.o [ 60%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleexpminusmax/gen/f32-vscaleexpminusmax-avx2-p5-u24.c.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleexpminusmax/gen/f32-vscaleexpminusmax-avx2-p5-u32.c.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleexpminusmax/gen/f32-vscaleexpminusmax-avx2-p5-u40.c.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleexpminusmax/gen/f32-vscaleexpminusmax-avx2-p5-u48.c.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleexpminusmax/gen/f32-vscaleexpminusmax-avx2-p5-u56.c.o [ 60%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/qtanh.cpp.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleexpminusmax/gen/f32-vscaleexpminusmax-avx2-p5-u64.c.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleexpminusmax/gen/f32-vscaleexpminusmax-avx2-p5-u72.c.o [ 60%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/qthreshold.cpp.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleexpminusmax/gen/f32-vscaleexpminusmax-avx2-p5-u80.c.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleexpminusmax/gen/f32-vscaleexpminusmax-avx2-p5-u88.c.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleexpminusmax/gen/f32-vscaleexpminusmax-avx2-p5-u96.c.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleextexp/gen/f32-vscaleextexp-avx2-p5-u8.c.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleextexp/gen/f32-vscaleextexp-avx2-p5-u16.c.o [ 60%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleextexp/gen/f32-vscaleextexp-avx2-p5-u24.c.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleextexp/gen/f32-vscaleextexp-avx2-p5-u32.c.o [ 60%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/library.cpp.o [ 60%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/qconv_unpack.cpp.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleextexp/gen/f32-vscaleextexp-avx2-p5-u40.c.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleextexp/gen/f32-vscaleextexp-avx2-p5-u48.c.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleextexp/gen/f32-vscaleextexp-avx2-p5-u56.c.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleextexp/gen/f32-vscaleextexp-avx2-p5-u64.c.o [ 60%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/qlinear_unpack.cpp.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleextexp/gen/f32-vscaleextexp-avx2-p5-u72.c.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleextexp/gen/f32-vscaleextexp-avx2-p5-u80.c.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleextexp/gen/f32-vscaleextexp-avx2-p5-u88.c.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleextexp/gen/f32-vscaleextexp-avx2-p5-u96.c.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx2-rr1-p5-div-u8.c.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx2-rr1-p5-div-u16.c.o [ 60%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx2-rr1-p5-div-u24.c.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx2-rr1-p5-div-u32.c.o [ 60%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/mkl/LinearAlgebra.cpp.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx2-rr1-p5-div-u40.c.o [ 60%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/mkl/SparseBlasImpl.cpp.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx2-rr1-p5-div-u48.c.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx2-rr1-p5-div-u56.c.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx2-rr1-p5-div-u64.c.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx2-rr1-p5-div-u72.c.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx2-rr1-p5-div-u80.c.o [ 60%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/mkl/SparseCsrLinearAlgebra.cpp.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx2-rr1-p5-nr1fma-u8.c.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx2-rr1-p5-nr1fma-u16.c.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx2-rr1-p5-nr1fma-u24.c.o [ 60%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/mkl/SpectralOps.cpp.o [ 60%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx2-rr1-p5-nr1fma-u32.c.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx2-rr1-p5-nr1fma-u40.c.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx2-rr1-p5-nr1fma-u48.c.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx2-rr1-p5-nr1fma-u56.c.o [ 60%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/mkldnn/BinaryOps.cpp.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx2-rr1-p5-nr1fma-u64.c.o [ 60%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/mkldnn/Conv.cpp.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx2-rr1-p5-nr1fma-u72.c.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx2-rr1-p5-nr1fma-u80.c.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx2-rr1-p5-nr2fma-u8.c.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx2-rr1-p5-nr2fma-u16.c.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx2-rr1-p5-nr2fma-u24.c.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx2-rr1-p5-nr2fma-u32.c.o [ 60%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/mkldnn/ConvPrepack.cpp.o [ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx2-rr1-p5-nr2fma-u40.c.o [ 60%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx2-rr1-p5-nr2fma-u48.c.o
[ 60%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/mkldnn/Copy.cpp.o
[ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx2-rr1-p5-nr2fma-u56.c.o
[ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx2-rr1-p5-nr2fma-u64.c.o
[ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx2-rr1-p5-nr2fma-u72.c.o
[ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx2-rr1-p5-nr2fma-u80.c.o
[ 60%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/mkldnn/Gelu.cpp.o
[ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut4-p4h3ts-perm-div-u8.c.o
[ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut4-p4h3ts-perm-div-u16.c.o
[ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut4-p4h3ts-perm-div-u24.c.o
[ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut4-p4h3ts-perm-div-u32.c.o
[ 60%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/mkldnn/IDeepRegistration.cpp.o
[ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut4-p4h3ts-perm-div-u40.c.o
[ 60%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/mkldnn/Linear.cpp.o
[ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut4-p4h3ts-perm-div-u48.c.o
[ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut4-p4h3ts-perm-div-u56.c.o
[ 60%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/mkldnn/MKLDNNCommon.cpp.o
[ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut4-p4h3ts-perm-div-u64.c.o
[ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut4-p4h3ts-perm-div-u72.c.o
[ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut4-p4h3ts-perm-div-u80.c.o
[ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut4-p4h3ts-perm-nr1adj-u8.c.o
[ 60%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/mkldnn/MKLDNNConversions.cpp.o
[ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut4-p4h3ts-perm-nr1adj-u16.c.o
[ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut4-p4h3ts-perm-nr1adj-u24.c.o
[ 60%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/mkldnn/Matmul.cpp.o
[ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut4-p4h3ts-perm-nr1adj-u32.c.o
[ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut4-p4h3ts-perm-nr1adj-u40.c.o
[ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut4-p4h3ts-perm-nr1adj-u48.c.o
[ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut4-p4h3ts-perm-nr1adj-u56.c.o
[ 60%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut4-p4h3ts-perm-nr1adj-u64.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut4-p4h3ts-perm-nr1adj-u72.c.o
[ 61%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/mkldnn/MkldnnTensorMath.cpp.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut4-p4h3ts-perm-nr1adj-u80.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-gather-div-u8.c.o
[ 61%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/mkldnn/Normalization.cpp.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-gather-div-u16.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-gather-div-u24.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-gather-div-u32.c.o
[ 61%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/mkldnn/OpContext.cpp.o
[ 61%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/mkldnn/Pooling.cpp.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-gather-div-u40.c.o
[ 61%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/mkldnn/Prelu.cpp.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-gather-div-u48.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-gather-div-u56.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-gather-div-u64.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-gather-div-u72.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-gather-div-u80.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-gather-nr1adj-u8.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-gather-nr1adj-u16.c.o
[ 61%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/mkldnn/RNN.cpp.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-gather-nr1adj-u24.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-gather-nr1adj-u32.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-gather-nr1adj-u40.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-gather-nr1adj-u48.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-gather-nr1adj-u56.c.o
[ 61%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/mkldnn/RegisterMkldnnOpContextClass.cpp.o
[ 61%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/mkldnn/Relu.cpp.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-gather-nr1adj-u64.c.o
[ 61%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/mkldnn/SoftMax.cpp.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-gather-nr1adj-u72.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-gather-nr1adj-u80.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-perm-div-u8.c.o
[ 61%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/mkldnn/TensorFactories.cpp.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-perm-div-u16.c.o
[ 61%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/mkldnn/TensorShape.cpp.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-perm-div-u24.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-perm-div-u32.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-perm-div-u40.c.o
[ 61%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/mkldnn/UnaryOps.cpp.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-perm-div-u48.c.o
[ 61%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/mkldnn/Utils.cpp.o
[ 61%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/transformers/attention.cpp.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-perm-div-u56.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-perm-div-u64.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-perm-div-u72.c.o
[ 61%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/transformers/sdp_utils_cpp.cpp.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-perm-div-u80.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-perm-nr1adj-u8.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-perm-nr1adj-u16.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-perm-nr1adj-u24.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-perm-nr1adj-u32.c.o
[ 61%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/transformers/transformer.cpp.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-perm-nr1adj-u40.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-perm-nr1adj-u48.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-perm-nr1adj-u56.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-perm-nr1adj-u64.c.o
[ 61%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/utils/Factory.cpp.o
[ 61%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/xnnpack/Activation.cpp.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-perm-nr1adj-u72.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-lut8-p4h3ts-perm-nr1adj-u80.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-p6h5ts-div-u8.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-p6h5ts-div-u16.c.o
[ 61%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/xnnpack/AveragePooling.cpp.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-p6h5ts-div-u24.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-p6h5ts-div-u32.c.o
[ 61%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/xnnpack/ChannelShuffle.cpp.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-p6h5ts-div-u40.c.o
[ 61%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/xnnpack/Convolution.cpp.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-p6h5ts-div-u48.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-p6h5ts-div-u56.c.o
[ 61%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/xnnpack/Init.cpp.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-p6h5ts-div-u64.c.o
[ 61%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/xnnpack/Linear.cpp.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-p6h5ts-div-u72.c.o
[ 61%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/xnnpack/MaxPooling.cpp.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-p6h5ts-div-u80.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-p6h5ts-nr1-u8.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-p6h5ts-nr1-u16.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-p6h5ts-nr1-u24.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-p6h5ts-nr1-u32.c.o
[ 61%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/xnnpack/OpContext.cpp.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-p6h5ts-nr1-u40.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-p6h5ts-nr1-u48.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-p6h5ts-nr1-u56.c.o
[ 61%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/xnnpack/RegisterOpContextClass.cpp.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-p6h5ts-nr1-u64.c.o
[ 61%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/xnnpack/Shim.cpp.o
[ 61%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/CompositeViewCopyKernels.cpp.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-p6h5ts-nr1-u72.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-p6h5ts-nr1-u80.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-p6h5ts-nr1adj-u8.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-p6h5ts-nr1adj-u16.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-p6h5ts-nr1adj-u24.c.o
[ 61%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/Functions.cpp.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-p6h5ts-nr1adj-u32.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-p6h5ts-nr1adj-u40.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-p6h5ts-nr1adj-u48.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-p6h5ts-nr1adj-u56.c.o
[ 61%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-p6h5ts-nr1adj-u64.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-p6h5ts-nr1adj-u72.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx2-expm1minus-rr1-p6h5ts-nr1adj-u80.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f16-expm1minus-avx2-rr1-p2.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f16-expm1minus-avx2-rr1-p3.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f16-expminus-avx2-rr1-p2.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f16-expminus-avx2-rr1-p3.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f16-sigmoid-avx2-rr1-p2-div.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f16-sigmoid-avx2-rr1-p2-rcp.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f16-sigmoid-avx2-rr1-p3-div.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f16-sigmoid-avx2-rr1-p3-rcp.c.o
[ 62%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/Operators_0.cpp.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-exp-avx2-rr2-lut8-p3-perm.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-exp-avx2-rr2-lut8-p4-perm.c.o
[ 62%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/Operators_1.cpp.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-exp-avx2-rr2-p5.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-expm1minus-avx2-rr1-lut4-p4-perm.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-expm1minus-avx2-rr1-lut8-p4-perm.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-expm1minus-avx2-rr1-lut16-p3-gather.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-expm1minus-avx2-rr1-p6.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-expminus-avx2-rr1-p5.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-expminus-avx2-rr2-p5.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-extexp-avx2-p5.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx2-rr1-lut64-p2-gather-div.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx2-rr1-lut64-p2-gather-nr1fma.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx2-rr1-lut64-p2-gather-nr2fma1adj.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx2-rr1-lut64-p2-gather-nr2fma.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx2-rr1-p5-div.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx2-rr1-p5-nr1fma.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx2-rr1-p5-nr2fma.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx2-rr2-lut64-p2-gather-div.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx2-rr2-lut64-p2-gather-nr1fma.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx2-rr2-lut64-p2-gather-nr2fma1adj.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx2-rr2-lut64-p2-gather-nr2fma.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx2-rr2-p5-div.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx2-rr2-p5-nr1fma.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx2-rr2-p5-nr2fma.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f16-tanh-avx2-expm1minus-rr1-p3h2ts-div.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f16-tanh-avx2-expm1minus-rr1-p3h2ts-rcp.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-avx2-expm1minus-rr1-lut4-p4h3ts-perm-div.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-avx2-expm1minus-rr1-lut4-p4h3ts-perm-nr1adj.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-avx2-expm1minus-rr1-lut8-p4h3ps-gather-div.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-avx2-expm1minus-rr1-lut8-p4h3ps-gather-nr1.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-avx2-expm1minus-rr1-lut8-p4h3ps-gather-nr1adj.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-avx2-expm1minus-rr1-lut8-p4h3ps-perm-div.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-avx2-expm1minus-rr1-lut8-p4h3ps-perm-nr1.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-avx2-expm1minus-rr1-lut8-p4h3ps-perm-nr1adj.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-avx2-expm1minus-rr1-p6h5ts-div.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-avx2-expm1minus-rr1-p6h5ts-nr1.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-avx2-expm1minus-rr1-p6h5ts-nr1adj.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-1x8c8-minmax-avx2.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-2x8c8-minmax-avx2.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-3x8c8-minmax-avx2.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-4x8c8-minmax-avx2.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-gemm/gen/qd8-f16-qc8w-gemm-1x8c8-minmax-avx2.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-gemm/gen/qd8-f16-qc8w-gemm-2x8c8-minmax-avx2.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-gemm/gen/qd8-f16-qc8w-gemm-3x8c8-minmax-avx2.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-gemm/gen/qd8-f16-qc8w-gemm-4x8c8-minmax-avx2.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-igemm/gen/qd8-f16-qc8w-igemm-1x8c8-minmax-avx2.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-igemm/gen/qd8-f16-qc8w-igemm-2x8c8-minmax-avx2.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-igemm/gen/qd8-f16-qc8w-igemm-3x8c8-minmax-avx2.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-igemm/gen/qd8-f16-qc8w-igemm-4x8c8-minmax-avx2.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-1x8c8-minmax-avx2.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-2x8c8-minmax-avx2.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-3x8c8-minmax-avx2.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-4x8c8-minmax-avx2.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-1x8c8-minmax-avx2.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-2x8c8-minmax-avx2.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-3x8c8-minmax-avx2.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-4x8c8-minmax-avx2.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-1x8c8-minmax-avx2.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-2x8c8-minmax-avx2.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-3x8c8-minmax-avx2.c.o
[ 62%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-4x8c8-minmax-avx2.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-5f5m5l8c8s8r-minmax-fp32-avx2-mul32.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-5f5m5l16c8s8r-minmax-fp32-avx2-mul32.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-5f5m5l16c16s16r-minmax-fp32-avx2-mul16-add16-vpunpck.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-5f5m5l16c16s16r-minmax-fp32-avx2-mul16-vpmovsx.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-5f5m5l16c16s16r-minmax-fp32-avx2-mul16-vpunpck.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-5f5m5l32c8s8r-minmax-fp32-avx2-mul32.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-5f5m5l32c16s16r-minmax-fp32-avx2-mul16-add16-vpunpck.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-5f5m5l32c16s16r-minmax-fp32-avx2-mul16-vpmovsx.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-5f5m5l32c16s16r-minmax-fp32-avx2-mul16-vpunpck.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-6f6m7l8c8s8r-minmax-fp32-avx2-mul32.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-6f6m7l16c8s8r-minmax-fp32-avx2-mul32.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-6f6m7l16c16s16r-minmax-fp32-avx2-mul16-add16-vpunpck.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-6f6m7l16c16s16r-minmax-fp32-avx2-mul16-vpmovsx.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-6f6m7l16c16s16r-minmax-fp32-avx2-mul16-vpunpck.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-6f6m7l32c8s8r-minmax-fp32-avx2-mul32.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-6f6m7l32c16s16r-minmax-fp32-avx2-mul16-add16-vpunpck.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-6f6m7l32c16s16r-minmax-fp32-avx2-mul16-vpmovsx.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-6f6m7l32c16s16r-minmax-fp32-avx2-mul16-vpunpck.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-8f8m9l8c8s8r-minmax-fp32-avx2-mul32.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-8f8m9l16c8s8r-minmax-fp32-avx2-mul32.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-8f8m9l16c16s16r-minmax-fp32-avx2-mul16-add16-vpunpck.c.o
[ 63%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/Operators_2.cpp.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-8f8m9l16c16s16r-minmax-fp32-avx2-mul16-vpmovsx.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-8f8m9l16c16s16r-minmax-fp32-avx2-mul16-vpunpck.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-8f8m9l32c8s8r-minmax-fp32-avx2-mul32.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-8f8m9l32c16s16r-minmax-fp32-avx2-mul16-add16-vpunpck.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-8f8m9l32c16s16r-minmax-fp32-avx2-mul16-vpmovsx.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-8f8m9l32c16s16r-minmax-fp32-avx2-mul16-vpunpck.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p8c-minmax-fp32-avx2-mul32.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p16c-minmax-fp32-avx2-mul16-add16-vpunpck.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p16c-minmax-fp32-avx2-mul16-vpmovsx.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p16c-minmax-fp32-avx2-mul16-vpunpck.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p16c-minmax-fp32-avx2-mul32.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p32c-minmax-fp32-avx2-mul16-add16-vpunpck.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p32c-minmax-fp32-avx2-mul16-vpmovsx.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p32c-minmax-fp32-avx2-mul16-vpunpck.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p32c-minmax-fp32-avx2-mul32.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p8c-minmax-fp32-avx2-mul32.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p16c-minmax-fp32-avx2-mul16-add16-vpunpck.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p16c-minmax-fp32-avx2-mul16-vpmovsx.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p16c-minmax-fp32-avx2-mul16-vpunpck.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p16c-minmax-fp32-avx2-mul32.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p32c-minmax-fp32-avx2-mul16-add16-vpunpck.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p32c-minmax-fp32-avx2-mul16-vpmovsx.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p32c-minmax-fp32-avx2-mul16-vpunpck.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p32c-minmax-fp32-avx2-mul32.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-f16-vcvt/gen/qs8-f16-vcvt-avx2-u16.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-f16-vcvt/gen/qs8-f16-vcvt-avx2-u24.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-f16-vcvt/gen/qs8-f16-vcvt-avx2-u32.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-f16-vcvt/gen/qs8-f16-vcvt-avx2-u64.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-f32-vcvt/gen/qs8-f32-vcvt-avx2-u8.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-f32-vcvt/gen/qs8-f32-vcvt-avx2-u16.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-f32-vcvt/gen/qs8-f32-vcvt-avx2-u24.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-f32-vcvt/gen/qs8-f32-vcvt-avx2-u32.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-3p16c-minmax-fp32-avx2-mul32.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-5f5m5l8c8s8r-minmax-fp32-avx2-mul32.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-5f5m5l16c8s8r-minmax-fp32-avx2-mul32.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-5f5m5l16c16s16r-minmax-fp32-avx2-mul16-add16-vpunpck.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-5f5m5l16c16s16r-minmax-fp32-avx2-mul16-vpmovsx.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-5f5m5l16c16s16r-minmax-fp32-avx2-mul16-vpunpck.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-5f5m5l32c8s8r-minmax-fp32-avx2-mul32.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-5f5m5l32c16s16r-minmax-fp32-avx2-mul16-add16-vpunpck.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-5f5m5l32c16s16r-minmax-fp32-avx2-mul16-vpmovsx.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-5f5m5l32c16s16r-minmax-fp32-avx2-mul16-vpunpck.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-6f6m7l8c8s8r-minmax-fp32-avx2-mul32.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-6f6m7l16c8s8r-minmax-fp32-avx2-mul32.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-6f6m7l16c16s16r-minmax-fp32-avx2-mul16-add16-vpunpck.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-6f6m7l16c16s16r-minmax-fp32-avx2-mul16-vpmovsx.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-6f6m7l16c16s16r-minmax-fp32-avx2-mul16-vpunpck.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-6f6m7l32c8s8r-minmax-fp32-avx2-mul32.c.o
[ 63%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-6f6m7l32c16s16r-minmax-fp32-avx2-mul16-add16-vpunpck.c.o
[ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-6f6m7l32c16s16r-minmax-fp32-avx2-mul16-vpmovsx.c.o
[ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-6f6m7l32c16s16r-minmax-fp32-avx2-mul16-vpunpck.c.o
[ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-8f8m9l8c8s8r-minmax-fp32-avx2-mul32.c.o
[ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-8f8m9l16c8s8r-minmax-fp32-avx2-mul32.c.o
[ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-8f8m9l16c16s16r-minmax-fp32-avx2-mul16-add16-vpunpck.c.o
[ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-8f8m9l16c16s16r-minmax-fp32-avx2-mul16-vpmovsx.c.o
[ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-8f8m9l16c16s16r-minmax-fp32-avx2-mul16-vpunpck.c.o
[ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-8f8m9l32c8s8r-minmax-fp32-avx2-mul32.c.o
[ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-8f8m9l32c16s16r-minmax-fp32-avx2-mul16-add16-vpunpck.c.o
[ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-8f8m9l32c16s16r-minmax-fp32-avx2-mul16-vpmovsx.c.o
[ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-8f8m9l32c16s16r-minmax-fp32-avx2-mul16-vpunpck.c.o
[ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p8c-minmax-fp32-avx2-mul32.c.o
[ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p16c-minmax-fp32-avx2-mul16-add16-vpunpck.c.o
[ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p16c-minmax-fp32-avx2-mul16-vpmovsx.c.o
[ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p16c-minmax-fp32-avx2-mul16-vpunpck.c.o
[ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p16c-minmax-fp32-avx2-mul32.c.o
[ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p32c-minmax-fp32-avx2-mul16-add16-vpunpck.c.o
[ 64%] Building C object
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p32c-minmax-fp32-avx2-mul16-vpmovsx.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p32c-minmax-fp32-avx2-mul16-vpunpck.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p32c-minmax-fp32-avx2-mul32.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p8c-minmax-fp32-avx2-mul32.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p16c-minmax-fp32-avx2-mul16-add16-vpunpck.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p16c-minmax-fp32-avx2-mul16-vpmovsx.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p16c-minmax-fp32-avx2-mul16-vpunpck.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p16c-minmax-fp32-avx2-mul32.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p32c-minmax-fp32-avx2-mul16-add16-vpunpck.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p32c-minmax-fp32-avx2-mul16-vpmovsx.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p32c-minmax-fp32-avx2-mul16-vpunpck.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p32c-minmax-fp32-avx2-mul32.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x8c8-minmax-fp32-avx2.c.o [ 64%] 
Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x8c8-minmax-fp32-avx2.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x8c8-minmax-fp32-avx2.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-4x8c8-minmax-fp32-avx2.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x8c8-minmax-fp32-avx2.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x8c8-minmax-fp32-avx2.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x8c8-minmax-fp32-avx2.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-4x8c8-minmax-fp32-avx2.c.o [ 64%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/Operators_3.cpp.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vadd/gen/qs8-vadd-minmax-avx2-mul32-ld64-u8.c.o [ 64%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/Operators_4.cpp.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vadd/gen/qs8-vadd-minmax-avx2-mul32-ld64-u16.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vadd/gen/qs8-vadd-minmax-avx2-mul32-ld64-u24.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vadd/gen/qs8-vadd-minmax-avx2-mul32-ld64-u32.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vaddc/gen/qs8-vaddc-minmax-avx2-mul32-ld64-u8.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vaddc/gen/qs8-vaddc-minmax-avx2-mul32-ld64-u16.c.o [ 64%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vaddc/gen/qs8-vaddc-minmax-avx2-mul32-ld64-u24.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vaddc/gen/qs8-vaddc-minmax-avx2-mul32-ld64-u32.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vcvt/gen/qs8-vcvt-avx2-u16.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vcvt/gen/qs8-vcvt-avx2-u32.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vcvt/gen/qs8-vcvt-avx2-u64.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vlrelu/gen/qs8-vlrelu-avx2-u16.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vlrelu/gen/qs8-vlrelu-avx2-u32.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vlrelu/gen/qs8-vlrelu-avx2-u64.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-5f5m5l8c8s8r-minmax-fp32-avx2-mul32.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-5f5m5l16c8s8r-minmax-fp32-avx2-mul32.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-5f5m5l32c8s8r-minmax-fp32-avx2-mul32.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-6f6m7l8c8s8r-minmax-fp32-avx2-mul32.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-6f6m7l16c8s8r-minmax-fp32-avx2-mul32.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-6f6m7l32c8s8r-minmax-fp32-avx2-mul32.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-8f8m9l8c8s8r-minmax-fp32-avx2-mul32.c.o [ 64%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-8f8m9l16c8s8r-minmax-fp32-avx2-mul32.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-8f8m9l32c8s8r-minmax-fp32-avx2-mul32.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-9p8c-minmax-fp32-avx2-mul32.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-9p16c-minmax-fp32-avx2-mul32.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-9p32c-minmax-fp32-avx2-mul32.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-25p8c-minmax-fp32-avx2-mul32.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-25p16c-minmax-fp32-avx2-mul32.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-25p32c-minmax-fp32-avx2-mul32.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-f32-vcvt/gen/qu8-f32-vcvt-avx2-u8.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-f32-vcvt/gen/qu8-f32-vcvt-avx2-u16.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-f32-vcvt/gen/qu8-f32-vcvt-avx2-u24.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-f32-vcvt/gen/qu8-f32-vcvt-avx2-u32.c.o [ 64%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-1x8c8-minmax-fp32-avx2.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-2x8c8-minmax-fp32-avx2.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-3x8c8-minmax-fp32-avx2.c.o [ 65%] Building C 
object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-4x8c8-minmax-fp32-avx2.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-1x8c8-minmax-fp32-avx2.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-2x8c8-minmax-fp32-avx2.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-3x8c8-minmax-fp32-avx2.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-4x8c8-minmax-fp32-avx2.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vadd/gen/qu8-vadd-minmax-avx2-mul32-ld64-u8.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vadd/gen/qu8-vadd-minmax-avx2-mul32-ld64-u16.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vaddc/gen/qu8-vaddc-minmax-avx2-mul32-ld64-u8.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vaddc/gen/qu8-vaddc-minmax-avx2-mul32-ld64-u16.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vcvt/gen/qu8-vcvt-avx2-u16.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vcvt/gen/qu8-vcvt-avx2-u32.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vcvt/gen/qu8-vcvt-avx2-u64.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vlrelu/gen/qu8-vlrelu-avx2-u16.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vlrelu/gen/qu8-vlrelu-avx2-u32.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vlrelu/gen/qu8-vlrelu-avx2-u64.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-lut/gen/x8-lut-avx2-u32.c.o [ 65%] Building C 
object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-lut/gen/x8-lut-avx2-u64.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-lut/gen/x8-lut-avx2-u96.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-lut/gen/x8-lut-avx2-u128.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-transposec/gen/x8-transposec-32x32-reuse-mov-avx2.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-transposec/gen/x8-transposec-32x32-reuse-switch-avx2.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x16-packw/gen/x16-packw-x8-gemm-goi-avx2-u16-prfm.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x16-packw/gen/x16-packw-x8-gemm-goi-avx2-u16.c.o [ 65%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/RegisterBackendSelect.cpp.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x16-packw/gen/x16-packw-x16-gemm-goi-avx2-u16-prfm.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x16-packw/gen/x16-packw-x16-gemm-goi-avx2-u16.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x16-transposec/gen/x16-transposec-16x16-reuse-mov-avx2.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x16-transposec/gen/x16-transposec-16x16-reuse-switch-avx2.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-3p16c-minmax-avx512f-acc2.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-3p16c-minmax-avx512f.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-3p32c-minmax-avx512f-acc2.c.o [ 65%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-3p32c-minmax-avx512f.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-4p16c-minmax-avx512f-acc2.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-4p16c-minmax-avx512f.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-4p32c-minmax-avx512f-acc2.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-4p32c-minmax-avx512f.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-5f5m5l16c16s1r-minmax-avx512f-acc2.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-5f5m5l16c16s1r-minmax-avx512f.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-5f5m5l32c16s1r-minmax-avx512f-acc2.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-5f5m5l32c16s1r-minmax-avx512f.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-9p16c-minmax-avx512f-acc2.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-9p16c-minmax-avx512f.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-9p32c-minmax-avx512f-acc2.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-9p32c-minmax-avx512f.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-25p16c-minmax-avx512f-acc2.c.o [ 65%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/RegisterCPU.cpp.o [ 65%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-25p16c-minmax-avx512f.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-25p32c-minmax-avx512f-acc2.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-dwconv/gen/f32-dwconv-25p32c-minmax-avx512f.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-1x16-minmax-avx512f-broadcast.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-4x16-minmax-avx512f-broadcast.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-5x16-minmax-avx512f-broadcast.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-6x16-minmax-avx512f-broadcast.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-7x16-minmax-avx512f-broadcast.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemm/gen/f32-gemm-8x16-minmax-avx512f-broadcast.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-1x16-minmax-avx512f-broadcast.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-4x16-minmax-avx512f-broadcast.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-5x16-minmax-avx512f-broadcast.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-6x16-minmax-avx512f-broadcast.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-7x16-minmax-avx512f-broadcast.c.o [ 65%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-gemminc/gen/f32-gemminc-8x16-minmax-avx512f-broadcast.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-1x16-minmax-avx512f-broadcast.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-4x16-minmax-avx512f-broadcast.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-5x16-minmax-avx512f-broadcast.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-6x16-minmax-avx512f-broadcast.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-7x16-minmax-avx512f-broadcast.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-igemm/gen/f32-igemm-8x16-minmax-avx512f-broadcast.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-prelu/gen/f32-prelu-avx512f-2x16.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-prelu/gen/f32-prelu-avx512f-2x32.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddexpminusmax/gen/f32-raddexpminusmax-avx512f-p5-scalef-u128-acc2.c.o [ 65%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddexpminusmax/gen/f32-raddexpminusmax-avx512f-p5-scalef-u128-acc4.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddexpminusmax/gen/f32-raddexpminusmax-avx512f-p5-scalef-u128.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddexpminusmax/gen/f32-raddexpminusmax-avx512f-p5-scalef-u144-acc3.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddexpminusmax/gen/f32-raddexpminusmax-avx512f-p5-scalef-u144.c.o [ 66%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddexpminusmax/gen/f32-raddexpminusmax-avx512f-p5-scalef-u160-acc2.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddexpminusmax/gen/f32-raddexpminusmax-avx512f-p5-scalef-u160-acc5.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddexpminusmax/gen/f32-raddexpminusmax-avx512f-p5-scalef-u160.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddexpminusmax/gen/f32-raddexpminusmax-avx512f-p5-scalef-u192-acc2.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddexpminusmax/gen/f32-raddexpminusmax-avx512f-p5-scalef-u192-acc3.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddexpminusmax/gen/f32-raddexpminusmax-avx512f-p5-scalef-u192-acc6.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddexpminusmax/gen/f32-raddexpminusmax-avx512f-p5-scalef-u192.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddextexp/gen/f32-raddextexp-avx512f-p5-scalef-u128-acc2.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddextexp/gen/f32-raddextexp-avx512f-p5-scalef-u128-acc4.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddextexp/gen/f32-raddextexp-avx512f-p5-scalef-u128.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddextexp/gen/f32-raddextexp-avx512f-p5-scalef-u144-acc3.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddextexp/gen/f32-raddextexp-avx512f-p5-scalef-u144.c.o [ 66%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/RegisterCompositeExplicitAutograd.cpp.o [ 66%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddextexp/gen/f32-raddextexp-avx512f-p5-scalef-u160-acc2.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddextexp/gen/f32-raddextexp-avx512f-p5-scalef-u160-acc5.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddextexp/gen/f32-raddextexp-avx512f-p5-scalef-u160.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddextexp/gen/f32-raddextexp-avx512f-p5-scalef-u192-acc2.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddextexp/gen/f32-raddextexp-avx512f-p5-scalef-u192-acc3.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddextexp/gen/f32-raddextexp-avx512f-p5-scalef-u192-acc6.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddextexp/gen/f32-raddextexp-avx512f-p5-scalef-u192.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-avx512f-rr1-p5-scalef-u128-acc2.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-avx512f-rr1-p5-scalef-u128-acc4.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-avx512f-rr1-p5-scalef-u128.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-avx512f-rr1-p5-scalef-u144-acc3.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-avx512f-rr1-p5-scalef-u144.c.o [ 66%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/RegisterCompositeExplicitAutogradNonFunctional.cpp.o [ 66%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-avx512f-rr1-p5-scalef-u160-acc2.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-avx512f-rr1-p5-scalef-u160-acc5.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-avx512f-rr1-p5-scalef-u160.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-avx512f-rr1-p5-scalef-u192-acc2.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-avx512f-rr1-p5-scalef-u192-acc3.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-avx512f-rr1-p5-scalef-u192-acc6.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-raddstoreexpminusmax/gen/f32-raddstoreexpminusmax-avx512f-rr1-p5-scalef-u192.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmax-avx512f-u16.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmax-avx512f-u32-acc2.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmax-avx512f-u48-acc3.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmax-avx512f-u64-acc2.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmax-avx512f-u64-acc4.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmin-avx512f-u16.c.o [ 66%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmin-avx512f-u32-acc2.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmin-avx512f-u48-acc3.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmin-avx512f-u64-acc2.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rmin-avx512f-u64-acc4.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rminmax-avx512f-u16.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rminmax-avx512f-u32-acc2.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rminmax-avx512f-u48-acc3.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rminmax-avx512f-u64-acc2.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rminmax/gen/f32-rminmax-avx512f-u64-acc4.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rsum/gen/f32-rsum-avx512f-u16.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rsum/gen/f32-rsum-avx512f-u32-acc2.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rsum/gen/f32-rsum-avx512f-u48-acc3.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rsum/gen/f32-rsum-avx512f-u64-acc2.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-rsum/gen/f32-rsum-avx512f-u64-acc4.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vadd-minmax-avx512f-u16.c.o [ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vadd-minmax-avx512f-u32.c.o [ 66%] 
Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vaddc-minmax-avx512f-u16.c.o
[ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vaddc-minmax-avx512f-u32.c.o
[ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vdiv-minmax-avx512f-u16.c.o
[ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vdiv-minmax-avx512f-u32.c.o
[ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vdivc-minmax-avx512f-u16.c.o
[ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vdivc-minmax-avx512f-u32.c.o
[ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmax-avx512f-u16.c.o
[ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmax-avx512f-u32.c.o
[ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmaxc-avx512f-u16.c.o
[ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmaxc-avx512f-u32.c.o
[ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmin-avx512f-u16.c.o
[ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmin-avx512f-u32.c.o
[ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vminc-avx512f-u16.c.o
[ 66%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vminc-avx512f-u32.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmul-minmax-avx512f-u16.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmul-minmax-avx512f-u32.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmulc-minmax-avx512f-u16.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vmulc-minmax-avx512f-u32.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vrdivc-minmax-avx512f-u16.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vrdivc-minmax-avx512f-u32.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vrsubc-minmax-avx512f-u16.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vrsubc-minmax-avx512f-u32.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsqrdiff-avx512f-u16.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsqrdiff-avx512f-u32.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsqrdiffc-avx512f-u16.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsqrdiffc-avx512f-u32.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsub-minmax-avx512f-u16.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsub-minmax-avx512f-u32.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsubc-minmax-avx512f-u16.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vbinary/gen/f32-vsubc-minmax-avx512f-u32.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vclamp/gen/f32-vclamp-avx512f-u16.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vclamp/gen/f32-vclamp-avx512f-u32.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx512f-rr1-lut16-p3-perm-u16.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx512f-rr1-lut16-p3-perm-u32.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx512f-rr1-lut16-p3-perm-u48.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx512f-rr1-lut16-p3-perm-u64.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx512f-rr1-lut16-p3-perm-u80.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx512f-rr1-lut16-p3-perm-u96.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx512f-rr1-lut16-p3-perm-u112.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx512f-rr1-lut16-p3-perm-u128.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx512f-rr1-p6-u16.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx512f-rr1-p6-u32.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx512f-rr1-p6-u48.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx512f-rr1-p6-u64.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx512f-rr1-p6-u80.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx512f-rr1-p6-u96.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx512f-rr1-p6-u112.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-velu/gen/f32-velu-avx512f-rr1-p6-u128.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vhswish/gen/f32-vhswish-avx512f-u16.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vhswish/gen/f32-vhswish-avx512f-u32.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vlrelu/gen/f32-vlrelu-avx512f-u16.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vlrelu/gen/f32-vlrelu-avx512f-u32.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrelu/gen/f32-vrelu-avx512f-u16.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrelu/gen/f32-vrelu-avx512f-u32.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndd-avx512f-u16.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndd-avx512f-u32.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndne-avx512f-u16.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndne-avx512f-u32.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndu-avx512f-u16.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndu-avx512f-u32.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndz-avx512f-u16.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrnd/gen/f32-vrndz-avx512f-u32.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrsqrt/gen/f32-vrsqrt-avx512f-rsqrt-u16.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrsqrt/gen/f32-vrsqrt-avx512f-rsqrt-u32.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vrsqrt/gen/f32-vrsqrt-avx512f-rsqrt-u64.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleexpminusmax/gen/f32-vscaleexpminusmax-avx512f-p5-scalef-u16.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleexpminusmax/gen/f32-vscaleexpminusmax-avx512f-p5-scalef-u32.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleexpminusmax/gen/f32-vscaleexpminusmax-avx512f-p5-scalef-u48.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleexpminusmax/gen/f32-vscaleexpminusmax-avx512f-p5-scalef-u64.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleexpminusmax/gen/f32-vscaleexpminusmax-avx512f-p5-scalef-u80.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleexpminusmax/gen/f32-vscaleexpminusmax-avx512f-p5-scalef-u96.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleexpminusmax/gen/f32-vscaleexpminusmax-avx512f-p5-scalef-u112.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleexpminusmax/gen/f32-vscaleexpminusmax-avx512f-p5-scalef-u128.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleexpminusmax/gen/f32-vscaleexpminusmax-avx512f-p5-scalef-u144.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleexpminusmax/gen/f32-vscaleexpminusmax-avx512f-p5-scalef-u160.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleexpminusmax/gen/f32-vscaleexpminusmax-avx512f-p5-scalef-u176.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleexpminusmax/gen/f32-vscaleexpminusmax-avx512f-p5-scalef-u192.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleextexp/gen/f32-vscaleextexp-avx512f-p5-scalef-u16.c.o
[ 67%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/RegisterCompositeImplicitAutograd.cpp.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleextexp/gen/f32-vscaleextexp-avx512f-p5-scalef-u32.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleextexp/gen/f32-vscaleextexp-avx512f-p5-scalef-u48.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleextexp/gen/f32-vscaleextexp-avx512f-p5-scalef-u64.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleextexp/gen/f32-vscaleextexp-avx512f-p5-scalef-u80.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleextexp/gen/f32-vscaleextexp-avx512f-p5-scalef-u96.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleextexp/gen/f32-vscaleextexp-avx512f-p5-scalef-u112.c.o
[ 67%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleextexp/gen/f32-vscaleextexp-avx512f-p5-scalef-u128.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleextexp/gen/f32-vscaleextexp-avx512f-p5-scalef-u144.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleextexp/gen/f32-vscaleextexp-avx512f-p5-scalef-u160.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleextexp/gen/f32-vscaleextexp-avx512f-p5-scalef-u176.c.o
[ 68%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/RegisterCompositeImplicitAutogradNestedTensor.cpp.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vscaleextexp/gen/f32-vscaleextexp-avx512f-p5-scalef-u192.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr1-lut16-p3-perm-scalef-div-u16.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr1-lut16-p3-perm-scalef-div-u32.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr1-lut16-p3-perm-scalef-div-u48.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr1-lut16-p3-perm-scalef-div-u64.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr1-lut16-p3-perm-scalef-div-u80.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr1-lut16-p3-perm-scalef-div-u96.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr1-lut16-p3-perm-scalef-div-u112.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr1-lut16-p3-perm-scalef-div-u128.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr1-lut16-p3-perm-scalef-nr1fma-u16.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr1-lut16-p3-perm-scalef-nr1fma-u32.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr1-lut16-p3-perm-scalef-nr1fma-u48.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr1-lut16-p3-perm-scalef-nr1fma-u64.c.o
[ 68%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/RegisterFunctionalization_0.cpp.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr1-lut16-p3-perm-scalef-nr1fma-u80.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr1-lut16-p3-perm-scalef-nr1fma-u96.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr1-lut16-p3-perm-scalef-nr1fma-u112.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr1-lut16-p3-perm-scalef-nr1fma-u128.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr1-p5-scalef-div-u16.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr1-p5-scalef-div-u32.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr1-p5-scalef-div-u48.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr1-p5-scalef-div-u64.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr1-p5-scalef-div-u80.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr1-p5-scalef-div-u96.c.o
[ 68%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/RegisterFunctionalization_1.cpp.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr1-p5-scalef-div-u112.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr1-p5-scalef-div-u128.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr1-p5-scalef-nr1fma-u16.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr1-p5-scalef-nr1fma-u32.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr1-p5-scalef-nr1fma-u48.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr1-p5-scalef-nr1fma-u64.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr1-p5-scalef-nr1fma-u80.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr1-p5-scalef-nr1fma-u96.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr1-p5-scalef-nr1fma-u112.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr1-p5-scalef-nr1fma-u128.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr2-lut32-p2-perm2-scalef-div-u16.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr2-lut32-p2-perm2-scalef-div-u32.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr2-lut32-p2-perm2-scalef-div-u48.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr2-lut32-p2-perm2-scalef-div-u64.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr2-lut32-p2-perm2-scalef-div-u80.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr2-lut32-p2-perm2-scalef-div-u96.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr2-lut32-p2-perm2-scalef-div-u112.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr2-lut32-p2-perm2-scalef-div-u128.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr2-lut32-p2-perm2-scalef-nr1fma-u16.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr2-lut32-p2-perm2-scalef-nr1fma-u32.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr2-lut32-p2-perm2-scalef-nr1fma-u48.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr2-lut32-p2-perm2-scalef-nr1fma-u64.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr2-lut32-p2-perm2-scalef-nr1fma-u80.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr2-lut32-p2-perm2-scalef-nr1fma-u96.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr2-lut32-p2-perm2-scalef-nr1fma-u112.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsigmoid/gen/f32-vsigmoid-avx512f-rr2-lut32-p2-perm2-scalef-nr1fma-u128.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsqrt/gen/f32-vsqrt-avx512f-nr1fma1adj-u16.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsqrt/gen/f32-vsqrt-avx512f-nr1fma1adj-u32.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vsqrt/gen/f32-vsqrt-avx512f-nr1fma1adj-u64.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vunary/gen/f32-vabs-avx512f-u16.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vunary/gen/f32-vabs-avx512f-u32.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vunary/gen/f32-vneg-avx512f-u16.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vunary/gen/f32-vneg-avx512f-u32.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vunary/gen/f32-vsqr-avx512f-u16.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vunary/gen/f32-vsqr-avx512f-u32.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-exp-avx512f-rr2-lut16-p3-perm-scalef.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-exp-avx512f-rr2-lut16-p3-perm.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-exp-avx512f-rr2-lut32-p2-perm2-scalef.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-exp-avx512f-rr2-lut32-p2-perm2.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-exp-avx512f-rr2-p5-scalef.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-exp-avx512f-rr2-p5.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-expm1minus-avx512f-rr1-lut16-p3-perm.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-expm1minus-avx512f-rr1-p6.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-extexp-avx512f-p5.c.o
[ 68%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx512f-rr1-lut16-p3-perm-scalef-div.c.o
[ 69%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx512f-rr1-lut16-p3-perm-scalef-nr1fma1adj.c.o
[ 69%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx512f-rr1-lut16-p3-perm-scalef-nr1fma.c.o
[ 69%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx512f-rr1-lut32-p2-perm2-scalef-div.c.o
[ 69%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx512f-rr1-lut32-p2-perm2-scalef-nr1fma1adj.c.o
[ 69%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx512f-rr1-lut32-p2-perm2-scalef-nr1fma.c.o
[ 69%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx512f-rr1-lut64-p2-gather-scalef-div.c.o
[ 69%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx512f-rr1-lut64-p2-gather-scalef-nr1fma1adj.c.o
[ 69%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx512f-rr1-lut64-p2-gather-scalef-nr1fma.c.o
[ 69%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx512f-rr1-p5-scalef-div.c.o
[ 69%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx512f-rr1-p5-scalef-nr1fma1adj.c.o
[ 69%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx512f-rr1-p5-scalef-nr1fma.c.o
[ 69%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx512f-rr2-lut16-p3-perm-scalef-div.c.o
[ 69%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx512f-rr2-lut16-p3-perm-scalef-nr1fma1adj.c.o
[ 69%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx512f-rr2-lut16-p3-perm-scalef-nr1fma.c.o
[ 69%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx512f-rr2-lut32-p2-perm2-scalef-div.c.o
[ 69%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx512f-rr2-lut32-p2-perm2-scalef-nr1fma1adj.c.o
[ 69%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx512f-rr2-lut32-p2-perm2-scalef-nr1fma.c.o
[ 69%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx512f-rr2-lut64-p2-gather-scalef-div.c.o
[ 69%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx512f-rr2-lut64-p2-gather-scalef-nr1fma1adj.c.o
[ 69%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx512f-rr2-lut64-p2-gather-scalef-nr1fma.c.o
[ 69%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx512f-rr2-p5-scalef-div.c.o
[ 69%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx512f-rr2-p5-scalef-nr1fma1adj.c.o
[ 69%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sigmoid-avx512f-rr2-p5-scalef-nr1fma.c.o
[ 69%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sqrt-avx512f-nr1fma1adj.c.o
[ 69%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sqrt-avx512f-nr1fma.c.o
[ 69%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/f32-sqrt-avx512f-nr2fma.c.o
[ 69%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packw/gen/x32-packw-x16-gemm-goi-avx512f-u4-prfm.c.o
[ 69%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/RegisterFunctionalization_2.cpp.o
[ 69%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x32-packw/gen/x32-packw-x16-gemm-goi-avx512f-u4.c.o
[ 69%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32-vcvt/gen/f16-f32-vcvt-avx512skx-u16.c.o
[ 69%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f16-f32-vcvt/gen/f16-f32-vcvt-avx512skx-u32.c.o
[ 70%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/RegisterFunctionalization_3.cpp.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-f16-vcvt/gen/f32-f16-vcvt-avx512skx-u16.c.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-f16-vcvt/gen/f32-f16-vcvt-avx512skx-u32.c.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc4w-gemm-1x32-minmax-avx512skx-broadcast.c.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc4w-gemm-2x32-minmax-avx512skx-broadcast.c.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc4w-gemm-3x32-minmax-avx512skx-broadcast.c.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc4w-gemm-4x32-minmax-avx512skx-broadcast.c.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc4w-gemm-5x32-minmax-avx512skx-broadcast.c.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc4w-gemm-6x32-minmax-avx512skx-broadcast.c.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc4w-gemm-7x32-minmax-avx512skx-broadcast.c.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc4w-gemm-8x32-minmax-avx512skx-broadcast.c.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-1x16-minmax-avx512skx-broadcast.c.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-1x32-minmax-avx512skx-broadcast.c.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-2x16-minmax-avx512skx-broadcast.c.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-2x32-minmax-avx512skx-broadcast.c.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-3x16-minmax-avx512skx-broadcast.c.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-3x32-minmax-avx512skx-broadcast.c.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-4x16-minmax-avx512skx-broadcast.c.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-4x32-minmax-avx512skx-broadcast.c.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-5x16-minmax-avx512skx-broadcast.c.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-5x32-minmax-avx512skx-broadcast.c.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-6x16-minmax-avx512skx-broadcast.c.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-6x32-minmax-avx512skx-broadcast.c.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-7x16-minmax-avx512skx-broadcast.c.o
[ 70%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/RegisterMeta.cpp.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-7x32-minmax-avx512skx-broadcast.c.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-8x16-minmax-avx512skx-broadcast.c.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qc8w-gemm/gen/f32-qc8w-gemm-8x32-minmax-avx512skx-broadcast.c.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qs8-vcvt/gen/f32-qs8-vcvt-avx512skx-u32.c.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qs8-vcvt/gen/f32-qs8-vcvt-avx512skx-u64.c.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qs8-vcvt/gen/f32-qs8-vcvt-avx512skx-u96.c.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qs8-vcvt/gen/f32-qs8-vcvt-avx512skx-u128.c.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qu8-vcvt/gen/f32-qu8-vcvt-avx512skx-u32.c.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qu8-vcvt/gen/f32-qu8-vcvt-avx512skx-u64.c.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qu8-vcvt/gen/f32-qu8-vcvt-avx512skx-u96.c.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-qu8-vcvt/gen/f32-qu8-vcvt-avx512skx-u128.c.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut4-p4h3ts-perm-div-u16.c.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut4-p4h3ts-perm-div-u32.c.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut4-p4h3ts-perm-div-u48.c.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut4-p4h3ts-perm-div-u64.c.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut4-p4h3ts-perm-div-u80.c.o
[ 70%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut4-p4h3ts-perm-div-u96.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut4-p4h3ts-perm-div-u112.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut4-p4h3ts-perm-div-u128.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut4-p4h3ts-perm-div-u144.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut4-p4h3ts-perm-div-u160.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut4-p4h3ts-perm-nr1adj-u16.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut4-p4h3ts-perm-nr1adj-u32.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut4-p4h3ts-perm-nr1adj-u48.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut4-p4h3ts-perm-nr1adj-u64.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut4-p4h3ts-perm-nr1adj-u80.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut4-p4h3ts-perm-nr1adj-u96.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut4-p4h3ts-perm-nr1adj-u112.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut4-p4h3ts-perm-nr1adj-u128.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut4-p4h3ts-perm-nr1adj-u144.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut4-p4h3ts-perm-nr1adj-u160.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-gather-div-u16.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-gather-div-u32.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-gather-div-u48.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-gather-div-u64.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-gather-div-u80.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-gather-div-u96.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-gather-div-u112.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-gather-div-u128.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-gather-div-u144.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-gather-div-u160.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-gather-nr1adj-u16.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-gather-nr1adj-u32.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-gather-nr1adj-u48.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-gather-nr1adj-u64.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-gather-nr1adj-u80.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-gather-nr1adj-u96.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-gather-nr1adj-u112.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-gather-nr1adj-u128.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-gather-nr1adj-u144.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-gather-nr1adj-u160.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-perm-div-u16.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-perm-div-u32.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-perm-div-u48.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-perm-div-u64.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-perm-div-u80.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-perm-div-u96.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-perm-div-u112.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-perm-div-u128.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-perm-div-u144.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-perm-div-u160.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-perm-nr1adj-u16.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-perm-nr1adj-u32.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-perm-nr1adj-u48.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-perm-nr1adj-u64.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-perm-nr1adj-u80.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-perm-nr1adj-u96.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-perm-nr1adj-u112.c.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-perm-nr1adj-u128.c.o
[ 71%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/RegisterMkldnnCPU.cpp.o
[ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-perm-nr1adj-u144.c.o
[ 71%] Building C object
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-lut8-p4h3ts-perm-nr1adj-u160.c.o [ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-p6h5ts-div-u16.c.o [ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-p6h5ts-div-u32.c.o [ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-p6h5ts-div-u48.c.o [ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-p6h5ts-div-u64.c.o [ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-p6h5ts-div-u80.c.o [ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-p6h5ts-div-u96.c.o [ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-p6h5ts-div-u112.c.o [ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-p6h5ts-div-u128.c.o [ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-p6h5ts-div-u144.c.o [ 71%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/RegisterNestedTensorCPU.cpp.o [ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-p6h5ts-div-u160.c.o [ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-p6h5ts-nr1-u16.c.o [ 71%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-p6h5ts-nr1-u32.c.o [ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-p6h5ts-nr1-u48.c.o [ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-p6h5ts-nr1-u64.c.o [ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-p6h5ts-nr1-u80.c.o [ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-p6h5ts-nr1-u96.c.o [ 71%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-p6h5ts-nr1-u112.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-p6h5ts-nr1-u128.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-p6h5ts-nr1-u144.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/f32-vtanh/gen/f32-vtanh-avx512skx-expm1minus-rr1-p6h5ts-nr1-u160.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-avx512skx-expm1minus-rr1-lut4-p4h3ts-perm-div.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-avx512skx-expm1minus-rr1-lut4-p4h3ts-perm-nr1adj.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-avx512skx-expm1minus-rr1-lut8-p4h3ps-gather-div.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-avx512skx-expm1minus-rr1-lut8-p4h3ps-gather-nr1.c.o [ 72%] Building CXX object 
caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/RegisterNestedTensorMeta.cpp.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-avx512skx-expm1minus-rr1-lut8-p4h3ps-gather-nr1adj.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-avx512skx-expm1minus-rr1-lut8-p4h3ps-perm-div.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-avx512skx-expm1minus-rr1-lut8-p4h3ps-perm-nr1.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-avx512skx-expm1minus-rr1-lut8-p4h3ps-perm-nr1adj.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-avx512skx-expm1minus-rr1-p6h5ts-div.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-avx512skx-expm1minus-rr1-p6h5ts-nr1.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/math/gen/f32-tanh-avx512skx-expm1minus-rr1-p6h5ts-nr1adj.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-1x8c8-minmax-avx512skx.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-2x8c8-minmax-avx512skx.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-3x8c8-minmax-avx512skx.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-4x8c8-minmax-avx512skx.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-5x8c8-minmax-avx512skx.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-6x8c8-minmax-avx512skx.c.o [ 72%] Building CXX 
object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/RegisterQuantizedCPU.cpp.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-7x8c8-minmax-avx512skx.c.o [ 72%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/RegisterQuantizedMeta.cpp.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-8x8c8-minmax-avx512skx.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-gemm/gen/qd8-f16-qc8w-gemm-1x8c8-minmax-avx512skx.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-gemm/gen/qd8-f16-qc8w-gemm-2x8c8-minmax-avx512skx.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-gemm/gen/qd8-f16-qc8w-gemm-3x8c8-minmax-avx512skx.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-gemm/gen/qd8-f16-qc8w-gemm-4x8c8-minmax-avx512skx.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-gemm/gen/qd8-f16-qc8w-gemm-5x8c8-minmax-avx512skx.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-gemm/gen/qd8-f16-qc8w-gemm-6x8c8-minmax-avx512skx.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-gemm/gen/qd8-f16-qc8w-gemm-7x8c8-minmax-avx512skx.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-gemm/gen/qd8-f16-qc8w-gemm-8x8c8-minmax-avx512skx.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-igemm/gen/qd8-f16-qc8w-igemm-1x8c8-minmax-avx512skx.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-igemm/gen/qd8-f16-qc8w-igemm-2x8c8-minmax-avx512skx.c.o [ 72%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-igemm/gen/qd8-f16-qc8w-igemm-3x8c8-minmax-avx512skx.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-igemm/gen/qd8-f16-qc8w-igemm-4x8c8-minmax-avx512skx.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-igemm/gen/qd8-f16-qc8w-igemm-5x8c8-minmax-avx512skx.c.o [ 72%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/RegisterSchema.cpp.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-igemm/gen/qd8-f16-qc8w-igemm-6x8c8-minmax-avx512skx.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-igemm/gen/qd8-f16-qc8w-igemm-7x8c8-minmax-avx512skx.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-igemm/gen/qd8-f16-qc8w-igemm-8x8c8-minmax-avx512skx.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-1x8c8-minmax-avx512skx.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-1x16c8-minmax-avx512skx-prfm.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-1x16c8-minmax-avx512skx.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-2x8c8-minmax-avx512skx.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-2x16c8-minmax-avx512skx-prfm.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-2x16c8-minmax-avx512skx.c.o [ 72%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-3x8c8-minmax-avx512skx.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-3x16c8-minmax-avx512skx-prfm.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-3x16c8-minmax-avx512skx.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-4x8c8-minmax-avx512skx.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-4x16c8-minmax-avx512skx-prfm.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-4x16c8-minmax-avx512skx.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-5x8c8-minmax-avx512skx.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-5x16c8-minmax-avx512skx-prfm.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-5x16c8-minmax-avx512skx.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-6x8c8-minmax-avx512skx.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-6x16c8-minmax-avx512skx-prfm.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-6x16c8-minmax-avx512skx.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-7x8c8-minmax-avx512skx.c.o [ 72%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-7x16c8-minmax-avx512skx-prfm.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-7x16c8-minmax-avx512skx.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-8x8c8-minmax-avx512skx.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-8x16c8-minmax-avx512skx-prfm.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-8x16c8-minmax-avx512skx.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-1x8c8-minmax-avx512skx.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-1x16c8-minmax-avx512skx-prfm.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-1x16c8-minmax-avx512skx.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-2x8c8-minmax-avx512skx.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-2x16c8-minmax-avx512skx-prfm.c.o [ 72%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/RegisterSparseCPU.cpp.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-2x16c8-minmax-avx512skx.c.o [ 72%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-3x8c8-minmax-avx512skx.c.o [ 72%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-3x16c8-minmax-avx512skx-prfm.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-3x16c8-minmax-avx512skx.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-4x8c8-minmax-avx512skx.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-4x16c8-minmax-avx512skx-prfm.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-4x16c8-minmax-avx512skx.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-5x8c8-minmax-avx512skx.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-5x16c8-minmax-avx512skx-prfm.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-5x16c8-minmax-avx512skx.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-6x8c8-minmax-avx512skx.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-6x16c8-minmax-avx512skx-prfm.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-6x16c8-minmax-avx512skx.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-7x8c8-minmax-avx512skx.c.o [ 73%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/RegisterSparseCsrCPU.cpp.o [ 73%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-7x16c8-minmax-avx512skx-prfm.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-7x16c8-minmax-avx512skx.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-8x8c8-minmax-avx512skx.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-8x16c8-minmax-avx512skx-prfm.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-8x16c8-minmax-avx512skx.c.o [ 73%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/RegisterSparseCsrMeta.cpp.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-1x8c8-minmax-avx512skx.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-1x16c8-minmax-avx512skx-prfm.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-1x16c8-minmax-avx512skx.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-2x8c8-minmax-avx512skx.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-2x16c8-minmax-avx512skx-prfm.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-2x16c8-minmax-avx512skx.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-3x8c8-minmax-avx512skx.c.o [ 73%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-3x16c8-minmax-avx512skx-prfm.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-3x16c8-minmax-avx512skx.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-4x8c8-minmax-avx512skx.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-4x16c8-minmax-avx512skx-prfm.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-4x16c8-minmax-avx512skx.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-5x8c8-minmax-avx512skx.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-5x16c8-minmax-avx512skx-prfm.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-5x16c8-minmax-avx512skx.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-6x8c8-minmax-avx512skx.c.o [ 73%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/RegisterSparseMeta.cpp.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-6x16c8-minmax-avx512skx-prfm.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-6x16c8-minmax-avx512skx.c.o [ 73%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/RegisterZeroTensor.cpp.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-7x8c8-minmax-avx512skx.c.o [ 73%] 
Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-7x16c8-minmax-avx512skx-prfm.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-7x16c8-minmax-avx512skx.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-8x8c8-minmax-avx512skx.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-8x16c8-minmax-avx512skx-prfm.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-8x16c8-minmax-avx512skx.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-5f5m5l16c16s1r-minmax-fp32-avx512skx-mul32.c.o [ 73%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/UfuncCPU_add.cpp.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-5f5m5l32c16s1r-minmax-fp32-avx512skx-mul32.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-6f6m7l16c16s1r-minmax-fp32-avx512skx-mul32.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-6f6m7l32c16s1r-minmax-fp32-avx512skx-mul32.c.o [ 73%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/ATenOpList.cpp.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-8f8m9l16c16s1r-minmax-fp32-avx512skx-mul32.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-8f8m9l32c16s1r-minmax-fp32-avx512skx-mul32.c.o [ 73%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p16c-minmax-fp32-avx512skx-mul32.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-9p32c-minmax-fp32-avx512skx-mul32.c.o [ 73%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/core/TensorMethods.cpp.o [ 73%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/quantized/QTensorImpl.cpp.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p16c-minmax-fp32-avx512skx-mul32.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-dwconv/gen/qs8-dwconv-25p32c-minmax-fp32-avx512skx-mul32.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-f32-vcvt/gen/qs8-f32-vcvt-avx512skx-u16.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-f32-vcvt/gen/qs8-f32-vcvt-avx512skx-u32.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-f32-vcvt/gen/qs8-f32-vcvt-avx512skx-u48.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-f32-vcvt/gen/qs8-f32-vcvt-avx512skx-u64.c.o [ 73%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/quantized/Quantizer.cpp.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-3p32c-minmax-fp32-avx512skx-mul32.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-5f5m5l16c16s1r-minmax-fp32-avx512skx-mul32.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-5f5m5l32c16s1r-minmax-fp32-avx512skx-mul32.c.o [ 73%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/nnapi/nnapi_bind.cpp.o [ 73%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-6f6m7l16c16s1r-minmax-fp32-avx512skx-mul32.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-6f6m7l32c16s1r-minmax-fp32-avx512skx-mul32.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-8f8m9l16c16s1r-minmax-fp32-avx512skx-mul32.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-8f8m9l32c16s1r-minmax-fp32-avx512skx-mul32.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p16c-minmax-fp32-avx512skx-mul32.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-9p32c-minmax-fp32-avx512skx-mul32.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p16c-minmax-fp32-avx512skx-mul32.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-dwconv/gen/qs8-qc8w-dwconv-25p32c-minmax-fp32-avx512skx-mul32.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x8c8-minmax-fp32-avx512skx.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x16c8-minmax-fp32-avx512skx-prfm.c.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x16c8-minmax-fp32-avx512skx.c.o [ 73%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/nnapi/nnapi_model_loader.cpp.o [ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x8c8-minmax-fp32-avx512skx.c.o [ 73%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x16c8-minmax-fp32-avx512skx-prfm.c.o
[ 73%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x16c8-minmax-fp32-avx512skx.c.o
[ 73%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/nnapi/nnapi_register.cpp.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x8c8-minmax-fp32-avx512skx.c.o
[ 74%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/nnapi/nnapi_wrapper.cpp.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x16c8-minmax-fp32-avx512skx-prfm.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x16c8-minmax-fp32-avx512skx.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-4x8c8-minmax-fp32-avx512skx.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-4x16c8-minmax-fp32-avx512skx-prfm.c.o
[ 74%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/UfuncCPUKernel_add.cpp.DEFAULT.cpp.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-4x16c8-minmax-fp32-avx512skx.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-5x16c8-minmax-fp32-avx512skx-prfm.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-5x16c8-minmax-fp32-avx512skx.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-6x16c8-minmax-fp32-avx512skx-prfm.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-6x16c8-minmax-fp32-avx512skx.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-7x16c8-minmax-fp32-avx512skx-prfm.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-7x16c8-minmax-fp32-avx512skx.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-8x16c8-minmax-fp32-avx512skx-prfm.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-8x16c8-minmax-fp32-avx512skx.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x8c8-minmax-fp32-avx512skx.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x16c8-minmax-fp32-avx512skx-prfm.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x16c8-minmax-fp32-avx512skx.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x8c8-minmax-fp32-avx512skx.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x16c8-minmax-fp32-avx512skx-prfm.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x16c8-minmax-fp32-avx512skx.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x8c8-minmax-fp32-avx512skx.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x16c8-minmax-fp32-avx512skx-prfm.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x16c8-minmax-fp32-avx512skx.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-4x8c8-minmax-fp32-avx512skx.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-4x16c8-minmax-fp32-avx512skx-prfm.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-4x16c8-minmax-fp32-avx512skx.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-5x16c8-minmax-fp32-avx512skx-prfm.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-5x16c8-minmax-fp32-avx512skx.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-6x16c8-minmax-fp32-avx512skx-prfm.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-6x16c8-minmax-fp32-avx512skx.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-7x16c8-minmax-fp32-avx512skx-prfm.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-7x16c8-minmax-fp32-avx512skx.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-8x16c8-minmax-fp32-avx512skx-prfm.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-8x16c8-minmax-fp32-avx512skx.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vadd/gen/qs8-vadd-minmax-avx512skx-mul32-ld128-u16.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vadd/gen/qs8-vadd-minmax-avx512skx-mul32-ld128-u32.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vaddc/gen/qs8-vaddc-minmax-avx512skx-mul32-ld128-u16.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-vaddc/gen/qs8-vaddc-minmax-avx512skx-mul32-ld128-u32.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-5f5m5l16c16s1r-minmax-fp32-avx512skx-mul32.c.o
[ 74%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/kernels/QuantizedOpKernels.cpp.DEFAULT.cpp.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-5f5m5l32c16s1r-minmax-fp32-avx512skx-mul32.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-6f6m7l16c16s1r-minmax-fp32-avx512skx-mul32.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-6f6m7l32c16s1r-minmax-fp32-avx512skx-mul32.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-8f8m9l16c16s1r-minmax-fp32-avx512skx-mul32.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-8f8m9l32c16s1r-minmax-fp32-avx512skx-mul32.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-9p16c-minmax-fp32-avx512skx-mul32.c.o
[ 74%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/spherical_bessel_j0.cpp.DEFAULT.cpp.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-9p32c-minmax-fp32-avx512skx-mul32.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-25p16c-minmax-fp32-avx512skx-mul32.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-dwconv/gen/qu8-dwconv-25p32c-minmax-fp32-avx512skx-mul32.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-f32-vcvt/gen/qu8-f32-vcvt-avx512skx-u16.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-f32-vcvt/gen/qu8-f32-vcvt-avx512skx-u32.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-f32-vcvt/gen/qu8-f32-vcvt-avx512skx-u48.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-f32-vcvt/gen/qu8-f32-vcvt-avx512skx-u64.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-1x8c8-minmax-fp32-avx512skx.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-1x16c8-minmax-fp32-avx512skx-prfm.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-1x16c8-minmax-fp32-avx512skx.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-2x8c8-minmax-fp32-avx512skx.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-2x16c8-minmax-fp32-avx512skx-prfm.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-2x16c8-minmax-fp32-avx512skx.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-3x8c8-minmax-fp32-avx512skx.c.o
[ 74%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/scaled_modified_bessel_k1.cpp.DEFAULT.cpp.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-3x16c8-minmax-fp32-avx512skx-prfm.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-3x16c8-minmax-fp32-avx512skx.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-4x8c8-minmax-fp32-avx512skx.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-4x16c8-minmax-fp32-avx512skx-prfm.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-4x16c8-minmax-fp32-avx512skx.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-5x16c8-minmax-fp32-avx512skx-prfm.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-5x16c8-minmax-fp32-avx512skx.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-6x16c8-minmax-fp32-avx512skx-prfm.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-6x16c8-minmax-fp32-avx512skx.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-7x16c8-minmax-fp32-avx512skx-prfm.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-7x16c8-minmax-fp32-avx512skx.c.o
[ 74%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-8x16c8-minmax-fp32-avx512skx-prfm.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-gemm/gen/qu8-gemm-8x16c8-minmax-fp32-avx512skx.c.o
[ 75%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/scaled_modified_bessel_k0.cpp.DEFAULT.cpp.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-1x8c8-minmax-fp32-avx512skx.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-1x16c8-minmax-fp32-avx512skx-prfm.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-1x16c8-minmax-fp32-avx512skx.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-2x8c8-minmax-fp32-avx512skx.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-2x16c8-minmax-fp32-avx512skx-prfm.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-2x16c8-minmax-fp32-avx512skx.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-3x8c8-minmax-fp32-avx512skx.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-3x16c8-minmax-fp32-avx512skx-prfm.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-3x16c8-minmax-fp32-avx512skx.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-4x8c8-minmax-fp32-avx512skx.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-4x16c8-minmax-fp32-avx512skx-prfm.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-4x16c8-minmax-fp32-avx512skx.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-5x16c8-minmax-fp32-avx512skx-prfm.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-5x16c8-minmax-fp32-avx512skx.c.o
[ 75%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/layer_norm_kernel.cpp.DEFAULT.cpp.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-6x16c8-minmax-fp32-avx512skx-prfm.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-6x16c8-minmax-fp32-avx512skx.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-7x16c8-minmax-fp32-avx512skx-prfm.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-7x16c8-minmax-fp32-avx512skx.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-8x16c8-minmax-fp32-avx512skx-prfm.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-igemm/gen/qu8-igemm-8x16c8-minmax-fp32-avx512skx.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vadd/gen/qu8-vadd-minmax-avx512skx-mul32-ld128-u16.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vadd/gen/qu8-vadd-minmax-avx512skx-mul32-ld128-u32.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vaddc/gen/qu8-vaddc-minmax-avx512skx-mul32-ld128-u16.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qu8-vaddc/gen/qu8-vaddc-minmax-avx512skx-mul32-ld128-u32.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-lut/gen/x8-lut-avx512skx-vpshufb-u64.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-lut/gen/x8-lut-avx512skx-vpshufb-u128.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-lut/gen/x8-lut-avx512skx-vpshufb-u192.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-lut/gen/x8-lut-avx512skx-vpshufb-u256.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-lut/gen/x8-lut-avx512vbmi-vpermx2b-u64.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-lut/gen/x8-lut-avx512vbmi-vpermx2b-u128.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-lut/gen/x8-lut-avx512vbmi-vpermx2b-u192.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/x8-lut/gen/x8-lut-avx512vbmi-vpermx2b-u256.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-1x8c8-minmax-avx512vnni-prfm.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-1x8c8-minmax-avx512vnni.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-2x8c8-minmax-avx512vnni-prfm.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-2x8c8-minmax-avx512vnni.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-3x8c8-minmax-avx512vnni-prfm.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-3x8c8-minmax-avx512vnni.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-4x8c8-minmax-avx512vnni-prfm.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-4x8c8-minmax-avx512vnni.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-5x8c8-minmax-avx512vnni-prfm.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-5x8c8-minmax-avx512vnni.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-6x8c8-minmax-avx512vnni-prfm.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-6x8c8-minmax-avx512vnni.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-7x8c8-minmax-avx512vnni-prfm.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-7x8c8-minmax-avx512vnni.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-8x8c8-minmax-avx512vnni-prfm.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-8x8c8-minmax-avx512vnni.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-gemm/gen/qd8-f16-qc8w-gemm-1x8c8-minmax-avx512vnni-prfm.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-gemm/gen/qd8-f16-qc8w-gemm-1x8c8-minmax-avx512vnni.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-gemm/gen/qd8-f16-qc8w-gemm-2x8c8-minmax-avx512vnni-prfm.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-gemm/gen/qd8-f16-qc8w-gemm-2x8c8-minmax-avx512vnni.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-gemm/gen/qd8-f16-qc8w-gemm-3x8c8-minmax-avx512vnni-prfm.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-gemm/gen/qd8-f16-qc8w-gemm-3x8c8-minmax-avx512vnni.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-gemm/gen/qd8-f16-qc8w-gemm-4x8c8-minmax-avx512vnni-prfm.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-gemm/gen/qd8-f16-qc8w-gemm-4x8c8-minmax-avx512vnni.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-gemm/gen/qd8-f16-qc8w-gemm-5x8c8-minmax-avx512vnni-prfm.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-gemm/gen/qd8-f16-qc8w-gemm-5x8c8-minmax-avx512vnni.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-gemm/gen/qd8-f16-qc8w-gemm-6x8c8-minmax-avx512vnni-prfm.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-gemm/gen/qd8-f16-qc8w-gemm-6x8c8-minmax-avx512vnni.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-gemm/gen/qd8-f16-qc8w-gemm-7x8c8-minmax-avx512vnni-prfm.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-gemm/gen/qd8-f16-qc8w-gemm-7x8c8-minmax-avx512vnni.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-gemm/gen/qd8-f16-qc8w-gemm-8x8c8-minmax-avx512vnni-prfm.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-gemm/gen/qd8-f16-qc8w-gemm-8x8c8-minmax-avx512vnni.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-igemm/gen/qd8-f16-qc8w-igemm-1x8c8-minmax-avx512vnni-prfm.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-igemm/gen/qd8-f16-qc8w-igemm-1x8c8-minmax-avx512vnni.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-igemm/gen/qd8-f16-qc8w-igemm-2x8c8-minmax-avx512vnni-prfm.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-igemm/gen/qd8-f16-qc8w-igemm-2x8c8-minmax-avx512vnni.c.o
[ 75%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-igemm/gen/qd8-f16-qc8w-igemm-3x8c8-minmax-avx512vnni-prfm.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-igemm/gen/qd8-f16-qc8w-igemm-3x8c8-minmax-avx512vnni.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-igemm/gen/qd8-f16-qc8w-igemm-4x8c8-minmax-avx512vnni-prfm.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-igemm/gen/qd8-f16-qc8w-igemm-4x8c8-minmax-avx512vnni.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-igemm/gen/qd8-f16-qc8w-igemm-5x8c8-minmax-avx512vnni-prfm.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-igemm/gen/qd8-f16-qc8w-igemm-5x8c8-minmax-avx512vnni.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-igemm/gen/qd8-f16-qc8w-igemm-6x8c8-minmax-avx512vnni-prfm.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-igemm/gen/qd8-f16-qc8w-igemm-6x8c8-minmax-avx512vnni.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-igemm/gen/qd8-f16-qc8w-igemm-7x8c8-minmax-avx512vnni-prfm.c.o
[ 76%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/int8mm_kernel.cpp.DEFAULT.cpp.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-igemm/gen/qd8-f16-qc8w-igemm-7x8c8-minmax-avx512vnni.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-igemm/gen/qd8-f16-qc8w-igemm-8x8c8-minmax-avx512vnni-prfm.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc8w-igemm/gen/qd8-f16-qc8w-igemm-8x8c8-minmax-avx512vnni.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-1x8c8-minmax-avx512vnni-prfm.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-1x8c8-minmax-avx512vnni.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-1x16c4-minmax-avx512vnni-prfm.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-1x16c4-minmax-avx512vnni.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-1x16c8-minmax-avx512vnni-prfm.c.o
[ 76%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/int4mm_kernel.cpp.DEFAULT.cpp.o
[ 76%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/group_norm_kernel.cpp.DEFAULT.cpp.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-1x16c8-minmax-avx512vnni.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-2x8c8-minmax-avx512vnni-prfm.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-2x8c8-minmax-avx512vnni.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-2x16c4-minmax-avx512vnni-prfm.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-2x16c4-minmax-avx512vnni.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-2x16c8-minmax-avx512vnni-prfm.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-2x16c8-minmax-avx512vnni.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-3x8c8-minmax-avx512vnni-prfm.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-3x8c8-minmax-avx512vnni.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-3x16c4-minmax-avx512vnni-prfm.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-3x16c4-minmax-avx512vnni.c.o
[ 76%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/batch_norm_kernel.cpp.DEFAULT.cpp.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-3x16c8-minmax-avx512vnni-prfm.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-3x16c8-minmax-avx512vnni.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-4x8c8-minmax-avx512vnni-prfm.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-4x8c8-minmax-avx512vnni.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-4x16c4-minmax-avx512vnni-prfm.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-4x16c4-minmax-avx512vnni.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-4x16c8-minmax-avx512vnni-prfm.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-4x16c8-minmax-avx512vnni.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-5x8c8-minmax-avx512vnni-prfm.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-5x8c8-minmax-avx512vnni.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-5x16c4-minmax-avx512vnni-prfm.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-5x16c4-minmax-avx512vnni.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-5x16c8-minmax-avx512vnni-prfm.c.o
[ 76%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/airy_ai.cpp.DEFAULT.cpp.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-5x16c8-minmax-avx512vnni.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-6x8c8-minmax-avx512vnni-prfm.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-6x8c8-minmax-avx512vnni.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-6x16c4-minmax-avx512vnni-prfm.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-6x16c4-minmax-avx512vnni.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-6x16c8-minmax-avx512vnni-prfm.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-6x16c8-minmax-avx512vnni.c.o
[ 76%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/WeightNormKernel.cpp.DEFAULT.cpp.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-7x8c8-minmax-avx512vnni-prfm.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-7x8c8-minmax-avx512vnni.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-7x16c4-minmax-avx512vnni-prfm.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-7x16c4-minmax-avx512vnni.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-7x16c8-minmax-avx512vnni-prfm.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-7x16c8-minmax-avx512vnni.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-8x8c8-minmax-avx512vnni-prfm.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-8x8c8-minmax-avx512vnni.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-8x16c4-minmax-avx512vnni-prfm.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-8x16c4-minmax-avx512vnni.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-8x16c8-minmax-avx512vnni-prfm.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-8x16c8-minmax-avx512vnni.c.o
[ 76%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/UpSampleMoreKernel.cpp.DEFAULT.cpp.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-1x8c8-minmax-avx512vnni-prfm.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-1x8c8-minmax-avx512vnni.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-1x16c4-minmax-avx512vnni-prfm.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-1x16c4-minmax-avx512vnni.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-1x16c8-minmax-avx512vnni-prfm.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-1x16c8-minmax-avx512vnni.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-2x8c8-minmax-avx512vnni-prfm.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-2x8c8-minmax-avx512vnni.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-2x16c4-minmax-avx512vnni-prfm.c.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-2x16c4-minmax-avx512vnni.c.o
[ 76%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/UpSampleKernel.cpp.DEFAULT.cpp.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-2x16c8-minmax-avx512vnni-prfm.c.o
[ 76%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/UnfoldBackwardKernel.cpp.DEFAULT.cpp.o
[ 76%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-2x16c8-minmax-avx512vnni.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-3x8c8-minmax-avx512vnni-prfm.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-3x8c8-minmax-avx512vnni.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-3x16c4-minmax-avx512vnni-prfm.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-3x16c4-minmax-avx512vnni.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-3x16c8-minmax-avx512vnni-prfm.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-3x16c8-minmax-avx512vnni.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-4x8c8-minmax-avx512vnni-prfm.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-4x8c8-minmax-avx512vnni.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-4x16c4-minmax-avx512vnni-prfm.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-4x16c4-minmax-avx512vnni.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-4x16c8-minmax-avx512vnni-prfm.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-4x16c8-minmax-avx512vnni.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-5x8c8-minmax-avx512vnni-prfm.c.o
[ 77%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/Unfold2d.cpp.DEFAULT.cpp.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-5x8c8-minmax-avx512vnni.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-5x16c4-minmax-avx512vnni-prfm.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-5x16c4-minmax-avx512vnni.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-5x16c8-minmax-avx512vnni-prfm.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-5x16c8-minmax-avx512vnni.c.o
[ 77%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/UnaryOpsKernel.cpp.DEFAULT.cpp.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-6x8c8-minmax-avx512vnni-prfm.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-6x8c8-minmax-avx512vnni.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-6x16c4-minmax-avx512vnni-prfm.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-6x16c4-minmax-avx512vnni.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-6x16c8-minmax-avx512vnni-prfm.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-6x16c8-minmax-avx512vnni.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-7x8c8-minmax-avx512vnni-prfm.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-7x8c8-minmax-avx512vnni.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-7x16c4-minmax-avx512vnni-prfm.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-7x16c4-minmax-avx512vnni.c.o
[ 77%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/TensorCompareKernel.cpp.DEFAULT.cpp.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-7x16c8-minmax-avx512vnni-prfm.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-7x16c8-minmax-avx512vnni.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-8x8c8-minmax-avx512vnni-prfm.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-8x8c8-minmax-avx512vnni.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-8x16c4-minmax-avx512vnni-prfm.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-8x16c4-minmax-avx512vnni.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-8x16c8-minmax-avx512vnni-prfm.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-gemm/gen/qd8-f32-qc8w-gemm-8x16c8-minmax-avx512vnni.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-1x8c8-minmax-avx512vnni-prfm.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-1x8c8-minmax-avx512vnni.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-1x16c4-minmax-avx512vnni-prfm.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-1x16c4-minmax-avx512vnni.c.o
[ 77%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/SumKernel.cpp.DEFAULT.cpp.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-1x16c8-minmax-avx512vnni-prfm.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-1x16c8-minmax-avx512vnni.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-2x8c8-minmax-avx512vnni-prfm.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-2x8c8-minmax-avx512vnni.c.o
[ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-2x16c4-minmax-avx512vnni-prfm.c.o
[ 77%] Building C object
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-2x16c4-minmax-avx512vnni.c.o [ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-2x16c8-minmax-avx512vnni-prfm.c.o [ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-2x16c8-minmax-avx512vnni.c.o [ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-3x8c8-minmax-avx512vnni-prfm.c.o [ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-3x8c8-minmax-avx512vnni.c.o [ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-3x16c4-minmax-avx512vnni-prfm.c.o [ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-3x16c4-minmax-avx512vnni.c.o [ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-3x16c8-minmax-avx512vnni-prfm.c.o [ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-3x16c8-minmax-avx512vnni.c.o [ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-4x8c8-minmax-avx512vnni-prfm.c.o [ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-4x8c8-minmax-avx512vnni.c.o [ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-4x16c4-minmax-avx512vnni-prfm.c.o [ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-4x16c4-minmax-avx512vnni.c.o [ 77%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-4x16c8-minmax-avx512vnni-prfm.c.o [ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-4x16c8-minmax-avx512vnni.c.o [ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-5x8c8-minmax-avx512vnni-prfm.c.o [ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-5x8c8-minmax-avx512vnni.c.o [ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-5x16c4-minmax-avx512vnni-prfm.c.o [ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-5x16c4-minmax-avx512vnni.c.o [ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-5x16c8-minmax-avx512vnni-prfm.c.o [ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-5x16c8-minmax-avx512vnni.c.o [ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-6x8c8-minmax-avx512vnni-prfm.c.o [ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-6x8c8-minmax-avx512vnni.c.o [ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-6x16c4-minmax-avx512vnni-prfm.c.o [ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-6x16c4-minmax-avx512vnni.c.o [ 77%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-6x16c8-minmax-avx512vnni-prfm.c.o [ 77%] Building CXX object 
caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/StackKernel.cpp.DEFAULT.cpp.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-6x16c8-minmax-avx512vnni.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-7x8c8-minmax-avx512vnni-prfm.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-7x8c8-minmax-avx512vnni.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-7x16c4-minmax-avx512vnni-prfm.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-7x16c4-minmax-avx512vnni.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-7x16c8-minmax-avx512vnni-prfm.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-7x16c8-minmax-avx512vnni.c.o [ 78%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/SpmmReduceKernel.cpp.DEFAULT.cpp.o [ 78%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/SparseFactories.cpp.DEFAULT.cpp.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-8x8c8-minmax-avx512vnni-prfm.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-8x8c8-minmax-avx512vnni.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-8x16c4-minmax-avx512vnni-prfm.c.o [ 78%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-8x16c4-minmax-avx512vnni.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-8x16c8-minmax-avx512vnni-prfm.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc8w-igemm/gen/qd8-f32-qc8w-igemm-8x16c8-minmax-avx512vnni.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x8c8-minmax-fp32-avx512vnni-prfm.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x8c8-minmax-fp32-avx512vnni.c.o [ 78%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/SortingKernel.cpp.DEFAULT.cpp.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x16c4-minmax-fp32-avx512vnni-prfm.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x16c4-minmax-fp32-avx512vnni.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x16c8-minmax-fp32-avx512vnni-prfm.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-1x16c8-minmax-fp32-avx512vnni.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x8c8-minmax-fp32-avx512vnni-prfm.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x8c8-minmax-fp32-avx512vnni.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x16c4-minmax-fp32-avx512vnni-prfm.c.o [ 78%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x16c4-minmax-fp32-avx512vnni.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x16c8-minmax-fp32-avx512vnni-prfm.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-2x16c8-minmax-fp32-avx512vnni.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x8c8-minmax-fp32-avx512vnni-prfm.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x8c8-minmax-fp32-avx512vnni.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x16c4-minmax-fp32-avx512vnni-prfm.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x16c4-minmax-fp32-avx512vnni.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x16c8-minmax-fp32-avx512vnni-prfm.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-3x16c8-minmax-fp32-avx512vnni.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-4x8c8-minmax-fp32-avx512vnni-prfm.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-4x8c8-minmax-fp32-avx512vnni.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-4x16c4-minmax-fp32-avx512vnni-prfm.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-4x16c4-minmax-fp32-avx512vnni.c.o [ 78%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-4x16c8-minmax-fp32-avx512vnni-prfm.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-4x16c8-minmax-fp32-avx512vnni.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-5x8c8-minmax-fp32-avx512vnni-prfm.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-5x8c8-minmax-fp32-avx512vnni.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-5x16c4-minmax-fp32-avx512vnni-prfm.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-5x16c4-minmax-fp32-avx512vnni.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-5x16c8-minmax-fp32-avx512vnni-prfm.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-5x16c8-minmax-fp32-avx512vnni.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-6x8c8-minmax-fp32-avx512vnni-prfm.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-6x8c8-minmax-fp32-avx512vnni.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-6x16c4-minmax-fp32-avx512vnni-prfm.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-6x16c4-minmax-fp32-avx512vnni.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-6x16c8-minmax-fp32-avx512vnni-prfm.c.o [ 78%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-6x16c8-minmax-fp32-avx512vnni.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-7x8c8-minmax-fp32-avx512vnni-prfm.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-7x8c8-minmax-fp32-avx512vnni.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-7x16c4-minmax-fp32-avx512vnni-prfm.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-7x16c4-minmax-fp32-avx512vnni.c.o [ 78%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/SoftMaxKernel.cpp.DEFAULT.cpp.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-7x16c8-minmax-fp32-avx512vnni-prfm.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-7x16c8-minmax-fp32-avx512vnni.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-8x8c8-minmax-fp32-avx512vnni-prfm.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-8x8c8-minmax-fp32-avx512vnni.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-8x16c4-minmax-fp32-avx512vnni-prfm.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-8x16c4-minmax-fp32-avx512vnni.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-8x16c8-minmax-fp32-avx512vnni-prfm.c.o [ 78%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-gemm/gen/qs8-qc8w-gemm-8x16c8-minmax-fp32-avx512vnni.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x8c8-minmax-fp32-avx512vnni-prfm.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x8c8-minmax-fp32-avx512vnni.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x16c4-minmax-avx512vnni-prfm.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x16c4-minmax-avx512vnni.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x16c8-minmax-avx512vnni-prfm.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-1x16c8-minmax-avx512vnni.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x8c8-minmax-fp32-avx512vnni-prfm.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x8c8-minmax-fp32-avx512vnni.c.o [ 78%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x16c4-minmax-avx512vnni-prfm.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x16c4-minmax-avx512vnni.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x16c8-minmax-avx512vnni-prfm.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-2x16c8-minmax-avx512vnni.c.o [ 79%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x8c8-minmax-fp32-avx512vnni-prfm.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x8c8-minmax-fp32-avx512vnni.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x16c4-minmax-avx512vnni-prfm.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x16c4-minmax-avx512vnni.c.o [ 79%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/ScatterGatherKernel.cpp.DEFAULT.cpp.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x16c8-minmax-avx512vnni-prfm.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-3x16c8-minmax-avx512vnni.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-4x8c8-minmax-fp32-avx512vnni-prfm.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-4x8c8-minmax-fp32-avx512vnni.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-4x16c4-minmax-avx512vnni-prfm.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-4x16c4-minmax-avx512vnni.c.o [ 79%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/SampledAddmmKernel.cpp.DEFAULT.cpp.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-4x16c8-minmax-avx512vnni-prfm.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-4x16c8-minmax-avx512vnni.c.o [ 79%] 
Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-5x8c8-minmax-fp32-avx512vnni-prfm.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-5x8c8-minmax-fp32-avx512vnni.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-5x16c4-minmax-avx512vnni-prfm.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-5x16c4-minmax-avx512vnni.c.o [ 79%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/RenormKernel.cpp.DEFAULT.cpp.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-5x16c8-minmax-avx512vnni-prfm.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-5x16c8-minmax-avx512vnni.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-6x8c8-minmax-fp32-avx512vnni-prfm.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-6x8c8-minmax-fp32-avx512vnni.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-6x16c4-minmax-avx512vnni-prfm.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-6x16c4-minmax-avx512vnni.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-6x16c8-minmax-avx512vnni-prfm.c.o [ 79%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/ReduceOpsKernel.cpp.DEFAULT.cpp.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-6x16c8-minmax-avx512vnni.c.o [ 
79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-7x8c8-minmax-fp32-avx512vnni-prfm.c.o [ 79%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/ReduceAllOpsKernel.cpp.DEFAULT.cpp.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-7x8c8-minmax-fp32-avx512vnni.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-7x16c4-minmax-avx512vnni-prfm.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-7x16c4-minmax-avx512vnni.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-7x16c8-minmax-avx512vnni-prfm.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-7x16c8-minmax-avx512vnni.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-8x8c8-minmax-fp32-avx512vnni-prfm.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-8x8c8-minmax-fp32-avx512vnni.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-8x16c4-minmax-avx512vnni-prfm.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-8x16c4-minmax-avx512vnni.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-8x16c8-minmax-avx512vnni-prfm.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qs8-qc8w-igemm/gen/qs8-qc8w-igemm-8x16c8-minmax-avx512vnni.c.o [ 79%] Building C object 
confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-1x8c8-minmax-avx512vnnigfni-prfm.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-1x8c8-minmax-avx512vnnigfni.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-2x8c8-minmax-avx512vnnigfni-prfm.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-2x8c8-minmax-avx512vnnigfni.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-3x8c8-minmax-avx512vnnigfni-prfm.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-3x8c8-minmax-avx512vnnigfni.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-4x8c8-minmax-avx512vnnigfni-prfm.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-4x8c8-minmax-avx512vnnigfni.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-5x8c8-minmax-avx512vnnigfni-prfm.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-5x8c8-minmax-avx512vnnigfni.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-6x8c8-minmax-avx512vnnigfni-prfm.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-6x8c8-minmax-avx512vnnigfni.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-7x8c8-minmax-avx512vnnigfni-prfm.c.o [ 79%] 
Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-7x8c8-minmax-avx512vnnigfni.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-8x8c8-minmax-avx512vnnigfni-prfm.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f16-qc4w-gemm/gen/qd8-f16-qc4w-gemm-8x8c8-minmax-avx512vnnigfni.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-1x8c8-minmax-avx512vnnigfni-prfm.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-1x8c8-minmax-avx512vnnigfni.c.o [ 79%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/RangeFactoriesKernel.cpp.DEFAULT.cpp.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-1x16c4-minmax-avx512vnnigfni-prfm.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-1x16c4-minmax-avx512vnnigfni.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-1x16c8-minmax-avx512vnnigfni-prfm.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-1x16c8-minmax-avx512vnnigfni.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-2x8c8-minmax-avx512vnnigfni-prfm.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-2x8c8-minmax-avx512vnnigfni.c.o [ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-2x16c4-minmax-avx512vnnigfni-prfm.c.o [ 79%] Building C 
object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-2x16c4-minmax-avx512vnnigfni.c.o
[ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-2x16c8-minmax-avx512vnnigfni-prfm.c.o
[ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-2x16c8-minmax-avx512vnnigfni.c.o
[ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-3x8c8-minmax-avx512vnnigfni-prfm.c.o
[ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-3x8c8-minmax-avx512vnnigfni.c.o
[ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-3x16c4-minmax-avx512vnnigfni-prfm.c.o
[ 79%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-3x16c4-minmax-avx512vnnigfni.c.o
[ 79%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/PowKernel.cpp.DEFAULT.cpp.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-3x16c8-minmax-avx512vnnigfni-prfm.c.o
[ 80%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/PointwiseOpsKernel.cpp.DEFAULT.cpp.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-3x16c8-minmax-avx512vnnigfni.c.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-4x8c8-minmax-avx512vnnigfni-prfm.c.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-4x8c8-minmax-avx512vnnigfni.c.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-4x16c4-minmax-avx512vnnigfni-prfm.c.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-4x16c4-minmax-avx512vnnigfni.c.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-4x16c8-minmax-avx512vnnigfni-prfm.c.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-4x16c8-minmax-avx512vnnigfni.c.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-5x8c8-minmax-avx512vnnigfni-prfm.c.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-5x8c8-minmax-avx512vnnigfni.c.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-5x16c4-minmax-avx512vnnigfni-prfm.c.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-5x16c4-minmax-avx512vnnigfni.c.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-5x16c8-minmax-avx512vnnigfni-prfm.c.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-5x16c8-minmax-avx512vnnigfni.c.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-6x8c8-minmax-avx512vnnigfni-prfm.c.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-6x8c8-minmax-avx512vnnigfni.c.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-6x16c4-minmax-avx512vnnigfni-prfm.c.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-6x16c4-minmax-avx512vnnigfni.c.o
[ 80%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/PixelShuffleKernel.cpp.DEFAULT.cpp.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-6x16c8-minmax-avx512vnnigfni-prfm.c.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-6x16c8-minmax-avx512vnnigfni.c.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-7x8c8-minmax-avx512vnnigfni-prfm.c.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-7x8c8-minmax-avx512vnnigfni.c.o
[ 80%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/PaddingKernel.cpp.DEFAULT.cpp.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-7x16c4-minmax-avx512vnnigfni-prfm.c.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-7x16c4-minmax-avx512vnnigfni.c.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-7x16c8-minmax-avx512vnnigfni-prfm.c.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-7x16c8-minmax-avx512vnnigfni.c.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-8x8c8-minmax-avx512vnnigfni-prfm.c.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-8x8c8-minmax-avx512vnnigfni.c.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-8x16c4-minmax-avx512vnnigfni-prfm.c.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-8x16c4-minmax-avx512vnnigfni.c.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-8x16c8-minmax-avx512vnnigfni-prfm.c.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/qd8-f32-qc4w-gemm/gen/qd8-f32-qc4w-gemm-8x16c8-minmax-avx512vnnigfni.c.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/tables/exp2-k-over-64.c.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/tables/exp2-k-over-2048.c.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/tables/exp2minus-k-over-4.c.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/tables/exp2minus-k-over-8.c.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/tables/exp2minus-k-over-16.c.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/tables/exp2minus-k-over-32.c.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/tables/exp2minus-k-over-64.c.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/tables/exp2minus-k-over-2048.c.o
[ 80%] Building C object confu-deps/XNNPACK/CMakeFiles/microkernels-all.dir/src/tables/vlog.c.o
[ 80%] Built target microkernels-all
[ 80%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/NativeMultiheadAttnKernel.cpp.DEFAULT.cpp.o
[ 80%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/MultinomialKernel.cpp.DEFAULT.cpp.o
[ 80%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/MaxUnpoolKernel.cpp.DEFAULT.cpp.o
[ 80%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/MaxPooling.cpp.DEFAULT.cpp.o
[ 80%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/MaxPoolKernel.cpp.DEFAULT.cpp.o
[ 80%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/LinearAlgebraKernel.cpp.DEFAULT.cpp.o
[ 80%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/LerpKernel.cpp.DEFAULT.cpp.o
[ 80%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/IndexKernel.cpp.DEFAULT.cpp.o
[ 80%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/HistogramKernel.cpp.DEFAULT.cpp.o
[ 80%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/GridSamplerKernel.cpp.DEFAULT.cpp.o
[ 80%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/FunctionOfAMatrixUtilsKernel.cpp.DEFAULT.cpp.o
[ 80%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/FlashAttentionKernel.cpp.DEFAULT.cpp.o
[ 80%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/FillKernel.cpp.DEFAULT.cpp.o
[ 80%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/DistributionKernels.cpp.DEFAULT.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/DistanceOpsKernel.cpp.DEFAULT.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/DepthwiseConvKernel.cpp.DEFAULT.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/CrossKernel.cpp.DEFAULT.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/CopyKernel.cpp.DEFAULT.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/ComplexKernel.cpp.DEFAULT.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/ChannelShuffleKernel.cpp.DEFAULT.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/CatKernel.cpp.DEFAULT.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/BlasKernel.cpp.DEFAULT.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/BinaryOpsKernel.cpp.DEFAULT.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/AvgPoolKernel.cpp.DEFAULT.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/AmpGradScalerKernels.cpp.DEFAULT.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/AdaptiveMaxPoolKernel.cpp.DEFAULT.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/AdaptiveAvgPoolKernel.cpp.DEFAULT.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/Activation.cpp.DEFAULT.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/vulkan/Context.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/metal/Context.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/core/common.cc.o
[ 81%] Building C object caffe2/CMakeFiles/torch_cpu.dir/__/third_party/miniz-2.1.0/miniz.c.o
/builddir/build/BUILD/pytorch/third_party/miniz-2.1.0/miniz.c:3157:9: note: '#pragma message: Using fopen, ftello, fseeko, stat() etc. path for file I/O - this path may not support large files.'
 3157 | #pragma message("Using fopen, ftello, fseeko, stat() etc. path for file I/O - this path may not support large files.")
      |         ^~~~~~~
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/serialize/inline_container.cc.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/serialize/istream_adapter.cc.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/serialize/file_adapter.cc.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/serialize/crc.cc.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/serialize/read_adapter_interface.cc.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/utils/string_utils.cc.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/utils/threadpool/ThreadPool.cc.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/utils/threadpool/pthreadpool-cpp.cc.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/utils/threadpool/thread_pool_guard.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/utils/proto_wrap.cc.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/generated/Functions.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/generated/ViewFuncs.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/generated/VariableType_0.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/generated/VariableType_1.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/generated/VariableType_2.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/generated/VariableType_3.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/generated/VariableType_4.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/generated/TraceType_0.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/generated/TraceType_1.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/generated/TraceType_2.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/generated/TraceType_3.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/generated/TraceType_4.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/generated/ADInplaceOrViewType_0.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/generated/ADInplaceOrViewType_1.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/inductor/aoti_torch/generated/c_shim_cpu.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/generated/LazyNativeFunctions.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/generated/RegisterAutogradLazy.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/generated/RegisterLazy.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/anomaly_mode.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/autograd.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/autograd_meta.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/autograd_not_implemented_fallback.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/cpp_hook.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/custom_function.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/engine.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/forward_grad.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/function.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/functions/accumulate_grad.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/functions/basic_ops.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/functions/tensor.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/functions/utils.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/input_buffer.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/input_metadata.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/jit_decomp_interface.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/profiler_kineto.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/profiler_legacy.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/record_function_ops.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/saved_variable.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/utils/warnings.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/variable.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/variable_info.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/inductor/aoti_runner/model_container_runner.cpp.o
[ 81%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/inductor/aoti_runner/model_container_runner_cpu.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/inductor/aoti_torch/shim_common.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/inductor/aoti_torch/tensor_converter.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/inductor/inductor_ops.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/api/function_impl.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/api/module.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/api/object.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/backends/backend_debug_handler.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/backends/backend_debug_info.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/backends/backend_detail.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/backends/backend_interface.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/backends/backend_resolver.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/codegen/fuser/codegen.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/codegen/fuser/compiler.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/codegen/fuser/executor.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/codegen/fuser/fallback.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/codegen/fuser/interface.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/codegen/fuser/kernel_cache.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/frontend/builtin_functions.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/frontend/canonicalize_modified_loop.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/frontend/convert_to_ssa.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/frontend/edit_distance.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/frontend/exit_transforms.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/frontend/inline_loop_condition.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/frontend/ir_emitter.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/frontend/name_mangler.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/frontend/parser.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/frontend/schema_matching.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/frontend/script_type_parser.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/frontend/sugared_value.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/frontend/tracer.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/frontend/versioned_symbols.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/ir/alias_analysis.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/ir/attributes.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/ir/constants.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/ir/graph_utils.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/ir/ir.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/ir/irparser.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/ir/node_hashing.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/ir/scope.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/ir/subgraph_matcher.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/ir/type_hashing.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/jit_log.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/jit_opt_limit.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/mobile/compatibility/model_compatibility.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/mobile/compatibility/runtime_compatibility.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/mobile/flatbuffer_loader.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/mobile/function.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/mobile/import.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/mobile/interpreter.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/mobile/module.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/mobile/nnc/aot_compiler.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/mobile/nnc/backend.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/mobile/nnc/context.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/mobile/nnc/registry.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/mobile/observer.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/mobile/parse_bytecode.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/mobile/parse_operators.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/mobile/prim_ops_registery.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/mobile/promoted_prim_ops.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/mobile/quantization.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/mobile/register_ops_common_utils.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/mobile/type_parser.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/mobile/upgrader_mobile.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/operator_upgraders/upgraders.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/operator_upgraders/upgraders_entry.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/operator_upgraders/utils.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/operator_upgraders/version_map.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/add_if_then_else.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/annotate_warns.cpp.o
[ 82%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/bailout_graph.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/batch_mm.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/canonicalize.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/canonicalize_graph_fuser_ops.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/check_strict_fusion.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/clear_profiling.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/clear_undefinedness.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/common_subexpression_elimination.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/concat_opt.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/constant_pooling.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/constant_propagation.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/create_autodiff_subgraphs.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/create_functional_graphs.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/dbr_quantization/remove_redundant_aliases.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/dead_code_elimination.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/decompose_ops.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/device_type_analysis.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/dtype_analysis.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/eliminate_no_ops.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/erase_number_types.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/fixup_trace_scope_blocks.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/fold_conv_bn.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/fold_linear_bn.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/freeze_module.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/frozen_concat_linear.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/frozen_conv_add_relu_fusion.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/frozen_conv_folding.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/frozen_graph_optimizations.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/frozen_linear_folding.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/frozen_linear_transpose.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/frozen_ops_to_mkldnn.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/fuse_linear.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/fuse_relu.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/graph_fuser.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/graph_rewrite_helper.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/guard_elimination.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/hoist_conv_packed_params.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/inline_autodiff_subgraphs.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/inline_fork_wait.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/inline_forked_closures.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/inliner.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/inplace_check.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/insert_guards.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/integer_value_refinement.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/lift_closures.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/liveness.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/loop_unrolling.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/lower_grad_of.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/lower_tuples.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/metal_rewrite.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/mkldnn_rewrite.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/normalize_ops.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/pass_manager.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/peephole.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/peephole_alias_sensitive.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/peephole_dict_idioms.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/peephole_list_idioms.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/peephole_non_tensor.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/prepack_folding.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/quantization/dedup_module_uses.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/quantization/finalize.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/quantization/fusion_passes.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/quantization/helper.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/quantization/insert_observers.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/quantization/insert_quant_dequant.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/quantization/quantization_type.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/quantization/register_packed_params.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/refine_tuple_types.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/remove_dropout.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/remove_exceptions.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/remove_expands.cpp.o
[ 83%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/remove_mutation.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/remove_redundant_profiles.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/replacement_of_old_operators.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/requires_grad_analysis.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/restore_mutation.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/shape_analysis.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/specialize_autogradzero.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/subgraph_rewrite.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/symbolic_shape_analysis.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/symbolic_shape_cache.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/symbolic_shape_runtime_fusion.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/tensorexpr_fuser.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/update_differentiable_graph_requires_grad.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/utils/memory_dag.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/utils/op_registry.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/utils/optimization_utils.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/utils/subgraph_utils.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/value_refinement_utils.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/variadic_ops.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/vulkan_rewrite.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/xnnpack_rewrite.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/python/update_graph_executor_opt.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/python/utf8_decoding_ignore.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/argument_spec.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/autodiff.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/decomposition_registry.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/decomposition_registry_util.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/graph_executor.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/instruction.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/interpreter.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/interpreter/frame.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/interpreter/preprocess_graph.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/jit_exception.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/jit_trace.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/logging.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/operator.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/print_handler.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/profiling_graph_executor_impl.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/profiling_record.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/register_ops_utils.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/script_profile.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/serialized_shape_function_registry.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/simple_graph_executor_impl.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/slice_indices_adjust.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/static/fusion.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/static/generated_ops.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/static/impl.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/static/memory_planner.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/static/native_ops.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/static/ops.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/static/passes.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/static/te_wrapper.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/symbolic_script.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/symbolic_shape_registry.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/symbolic_shape_registry_util.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/vararg_functions.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/serialization/callstack_debug_info_serialization.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/serialization/import.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/serialization/import_export_helpers.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/serialization/import_read.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/serialization/import_source.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/serialization/pickle.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/serialization/pickler.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/serialization/python_print.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/serialization/source_range_serialization.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/serialization/type_name_uniquer.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/serialization/unpickler.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/block_codegen.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/bounds_inference.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/bounds_overlap.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/codegen.cpp.o
[ 84%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/cpp_codegen.cpp.o
[ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/eval.cpp.o
[ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/expr.cpp.o
[ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/external_functions.cpp.o
[ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/external_functions_codegen.cpp.o
[ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/external_functions_core.cpp.o
[ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/external_functions_registry.cpp.o
[ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/graph_opt.cpp.o
[ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/hash_provider.cpp.o
[ 85%] Building CXX object
caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/intrinsic_symbols.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/ir.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/ir_cloner.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/ir_mutator.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/ir_printer.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/ir_simplifier.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/ir_verifier.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/ir_visitor.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/kernel.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/llvm_codegen.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/llvm_jit.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/loopnest.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/loopnest_randomization.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/lowerings.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/mem_dependency_checker.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/operators/conv2d.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/operators/matmul.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/operators/misc.cpp.o [ 85%] Building CXX object 
caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/operators/norm.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/operators/pointwise.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/operators/quantization.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/operators/reduction.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/operators/softmax.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/reduction.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/registerizer.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/tensor.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/types.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/tensorexpr/unique_name_manager.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/testing/file_check.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/testing/hooks_for_testing.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/backend/backend_device.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/backend/backend_interface.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/backend/lowering_context.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/core/config.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/core/debug_util.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/core/hash.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/core/helpers.cpp.o [ 85%] 
Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/core/ir.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/core/ir_dump_util.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/core/ir_metadata.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/core/ir_util.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/core/lazy_graph_executor.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/core/metrics.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/core/multi_wait.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/core/ops/arithmetic_ir_ops.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/core/ops/utils.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/core/permutation_util.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/core/shape.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/core/shape_inference.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/core/tensor.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/core/tensor_impl.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/core/tensor_util.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/core/thread_pool.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/core/trie.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/monitor/counters.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/monitor/events.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/profiler/collection.cpp.o [ 85%] 
Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/profiler/combined_traceback.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/profiler/data_flow.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/profiler/kineto_client_interface.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/profiler/kineto_shim.cpp.o [ 85%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/profiler/orchestration/observer.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/profiler/orchestration/python_tracer.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/profiler/orchestration/vulkan.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/profiler/perf.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/profiler/standalone/execution_trace_observer.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/profiler/standalone/itt_observer.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/profiler/standalone/nvtx_observer.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/profiler/stubs/base.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/profiler/unwind/unwind.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/profiler/unwind/unwind_fb.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/profiler/util.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/utils/cpp_stacktraces.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/utils/schema_info.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/utils/tensor_flatten.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/utils/variadic.cpp.o [ 86%] Building CXX 
object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/codegen/cuda/interface.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/autocast.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/lower_graph.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/remove_inplace_ops.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/passes/utils/check_alias_annotation.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/register_c10_ops.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/register_prim_ops.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/register_prim_ops_fulljit.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/register_special_ops.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/mobile/debug_info.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/ts_backend/dynamic_ir.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/ts_backend/config.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/ts_backend/ops/device_data.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/ts_backend/ops/generic.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/ts_backend/tensor_aten_ops.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/ts_backend/ts_autograd_functions.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/ts_backend/ts_backend_impl.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/ts_backend/ts_eager_fallback.cpp.o [ 86%] Building CXX object 
caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/ts_backend/ts_lowering_context.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/ts_backend/ts_native_functions.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/ts_backend/ts_node.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/lazy/ts_backend/ts_node_lowering.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/mobile/import_data.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/mobile/train/export_data.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/mobile/train/optim/sgd.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/mobile/train/random.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/mobile/train/sequential.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/serialization/flatbuffer_serializer.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/FunctionsManual.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/utils/out_types.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/TraceTypeManual.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/VariableTypeManual.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/itt_wrapper.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/profiler/stubs/itt.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/jit.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/mobile/compatibility/backport.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/mobile/compatibility/backport_manager.cpp.o [ 86%] 
Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/serialization/onnx.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/serialization/export.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/serialization/export_bytecode.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/serialization/export_module.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/codegen/fuser/cpu/fused_kernel.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/api/module_save.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/utils/byte_order.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/c10d/Backend.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/c10d/FileStore.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/c10d/Functional.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/c10d/GlooDeviceFactory.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/c10d/GroupRegistry.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/c10d/Ops.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/c10d/ParamCommsUtils.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/c10d/PrefixStore.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/c10d/ProcessGroup.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/c10d/ProcessGroupGloo.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/c10d/ProcessGroupMPI.cpp.o [ 86%] Building CXX object 
caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/c10d/ProcessGroupWrapper.cpp.o [ 86%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/c10d/Store.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/c10d/TCPStore.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/c10d/TCPStoreBackend.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/c10d/TCPStoreLibUvBackend.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/c10d/Utils.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/c10d/comm.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/c10d/debug.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/c10d/default_comm_hooks.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/c10d/logger.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/c10d/logging.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/c10d/quantization/quantization.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/c10d/reducer.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/c10d/sequence_num.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/c10d/socket.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/c10d/Work.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/autograd/autograd.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/autograd/utils.cpp.o [ 87%] Building CXX object 
caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/autograd/context/container.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/autograd/context/context.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/autograd/engine/dist_engine.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/autograd/functions/recvrpc_backward.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/autograd/functions/sendrpc_backward.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/autograd/rpc_messages/autograd_metadata.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/autograd/rpc_messages/propagate_gradients_req.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/autograd/rpc_messages/propagate_gradients_resp.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/autograd/rpc_messages/cleanup_autograd_context_req.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/autograd/rpc_messages/cleanup_autograd_context_resp.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/autograd/rpc_messages/rpc_with_autograd.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/autograd/rpc_messages/rpc_with_profiling_req.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/autograd/rpc_messages/rpc_with_profiling_resp.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/autograd/rpc_messages/rref_backward_req.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/autograd/rpc_messages/rref_backward_resp.cpp.o [ 87%] Building CXX object 
caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/c10d/HashStore.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/c10d/ProcessGroupRoundRobin.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/rpc/agent_utils.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/rpc/message.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/rpc/profiler/remote_profiler_manager.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/rpc/profiler/server_process_global_profiler.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/rpc/python_call.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/rpc/python_remote_call.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/rpc/python_resp.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/rpc/request_callback.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/rpc/request_callback_no_python.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/rpc/rpc_agent.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/rpc/rref_context.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/rpc/rref_impl.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/rpc/rref_proto.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/rpc/script_call.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/rpc/script_remote_call.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/rpc/script_resp.cpp.o [ 87%] 
Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/rpc/tensorpipe_agent.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/rpc/tensorpipe_utils.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/rpc/testing/faulty_tensorpipe_agent.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/rpc/torchscript_functions.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/rpc/types.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/distributed/rpc/utils.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/cuda.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/data/datasets/mnist.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/data/samplers/distributed.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/data/samplers/random.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/data/samplers/sequential.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/data/samplers/stream.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/enum.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/imethod.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/serialize.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/mps.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/nn/init.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/nn/module.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/nn/modules/_functions.cpp.o [ 87%] Building CXX 
object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/nn/modules/activation.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/nn/modules/adaptive.cpp.o [ 87%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/nn/modules/batchnorm.cpp.o [ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/nn/modules/normalization.cpp.o [ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/nn/modules/instancenorm.cpp.o [ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/nn/modules/conv.cpp.o [ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/nn/modules/dropout.cpp.o [ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/nn/modules/distance.cpp.o [ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/nn/modules/embedding.cpp.o [ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/nn/modules/fold.cpp.o [ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/nn/modules/linear.cpp.o [ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/nn/modules/loss.cpp.o [ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/nn/modules/padding.cpp.o [ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/nn/modules/pixelshuffle.cpp.o [ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/nn/modules/pooling.cpp.o [ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/nn/modules/rnn.cpp.o [ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/nn/modules/upsampling.cpp.o [ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/nn/modules/transformer.cpp.o [ 88%] Building CXX object 
caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/nn/modules/container/functional.cpp.o [ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/nn/options/activation.cpp.o [ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/nn/options/adaptive.cpp.o [ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/nn/options/batchnorm.cpp.o [ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/nn/options/embedding.cpp.o [ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/nn/options/instancenorm.cpp.o [ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/nn/options/normalization.cpp.o [ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/nn/options/conv.cpp.o [ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/nn/options/dropout.cpp.o [ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/nn/options/linear.cpp.o [ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/nn/options/padding.cpp.o [ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/nn/options/pooling.cpp.o [ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/nn/options/rnn.cpp.o [ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/nn/options/vision.cpp.o [ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/nn/options/transformer.cpp.o [ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/optim/adagrad.cpp.o [ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/optim/adam.cpp.o [ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/optim/adamw.cpp.o [ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/optim/lbfgs.cpp.o [ 88%] 
Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/optim/optimizer.cpp.o
[ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/optim/rmsprop.cpp.o
[ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/optim/serialize.cpp.o
[ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/optim/sgd.cpp.o
[ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/optim/schedulers/lr_scheduler.cpp.o
[ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/optim/schedulers/step_lr.cpp.o
[ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/optim/schedulers/reduce_on_plateau_scheduler.cpp.o
[ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/serialize/input-archive.cpp.o
[ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/serialize/output-archive.cpp.o
[ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/api/src/xpu.cpp.o
[ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/UfuncCPUKernel_add.cpp.AVX2.cpp.o
[ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/kernels/QuantizedOpKernels.cpp.AVX2.cpp.o
[ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/spherical_bessel_j0.cpp.AVX2.cpp.o
[ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/scaled_modified_bessel_k1.cpp.AVX2.cpp.o
[ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/scaled_modified_bessel_k0.cpp.AVX2.cpp.o
[ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/layer_norm_kernel.cpp.AVX2.cpp.o
[ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/int8mm_kernel.cpp.AVX2.cpp.o
[ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/int4mm_kernel.cpp.AVX2.cpp.o
[ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/group_norm_kernel.cpp.AVX2.cpp.o
[ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/batch_norm_kernel.cpp.AVX2.cpp.o
[ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/airy_ai.cpp.AVX2.cpp.o
[ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/WeightNormKernel.cpp.AVX2.cpp.o
[ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/UpSampleMoreKernel.cpp.AVX2.cpp.o
[ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/UpSampleKernel.cpp.AVX2.cpp.o
[ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/UnfoldBackwardKernel.cpp.AVX2.cpp.o
[ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/Unfold2d.cpp.AVX2.cpp.o
[ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/UnaryOpsKernel.cpp.AVX2.cpp.o
[ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/TensorCompareKernel.cpp.AVX2.cpp.o
[ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/SumKernel.cpp.AVX2.cpp.o
[ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/StackKernel.cpp.AVX2.cpp.o
[ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/SpmmReduceKernel.cpp.AVX2.cpp.o
[ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/SparseFactories.cpp.AVX2.cpp.o
[ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/SortingKernel.cpp.AVX2.cpp.o
[ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/SoftMaxKernel.cpp.AVX2.cpp.o
[ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/ScatterGatherKernel.cpp.AVX2.cpp.o
[ 88%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/SampledAddmmKernel.cpp.AVX2.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/RenormKernel.cpp.AVX2.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/ReduceOpsKernel.cpp.AVX2.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/ReduceAllOpsKernel.cpp.AVX2.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/RangeFactoriesKernel.cpp.AVX2.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/PowKernel.cpp.AVX2.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/PointwiseOpsKernel.cpp.AVX2.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/PixelShuffleKernel.cpp.AVX2.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/PaddingKernel.cpp.AVX2.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/NativeMultiheadAttnKernel.cpp.AVX2.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/MultinomialKernel.cpp.AVX2.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/MaxUnpoolKernel.cpp.AVX2.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/MaxPooling.cpp.AVX2.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/MaxPoolKernel.cpp.AVX2.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/LinearAlgebraKernel.cpp.AVX2.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/LerpKernel.cpp.AVX2.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/IndexKernel.cpp.AVX2.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/HistogramKernel.cpp.AVX2.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/GridSamplerKernel.cpp.AVX2.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/FunctionOfAMatrixUtilsKernel.cpp.AVX2.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/FlashAttentionKernel.cpp.AVX2.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/FillKernel.cpp.AVX2.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/DistributionKernels.cpp.AVX2.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/DistanceOpsKernel.cpp.AVX2.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/DepthwiseConvKernel.cpp.AVX2.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/CrossKernel.cpp.AVX2.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/CopyKernel.cpp.AVX2.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/ComplexKernel.cpp.AVX2.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/ChannelShuffleKernel.cpp.AVX2.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/CatKernel.cpp.AVX2.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/BlasKernel.cpp.AVX2.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/BinaryOpsKernel.cpp.AVX2.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/AvgPoolKernel.cpp.AVX2.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/AmpGradScalerKernels.cpp.AVX2.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/AdaptiveMaxPoolKernel.cpp.AVX2.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/AdaptiveAvgPoolKernel.cpp.AVX2.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/Activation.cpp.AVX2.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/UfuncCPUKernel_add.cpp.AVX512.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/kernels/QuantizedOpKernels.cpp.AVX512.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/spherical_bessel_j0.cpp.AVX512.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/scaled_modified_bessel_k1.cpp.AVX512.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/scaled_modified_bessel_k0.cpp.AVX512.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/layer_norm_kernel.cpp.AVX512.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/int8mm_kernel.cpp.AVX512.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/int4mm_kernel.cpp.AVX512.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/group_norm_kernel.cpp.AVX512.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/batch_norm_kernel.cpp.AVX512.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/airy_ai.cpp.AVX512.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/WeightNormKernel.cpp.AVX512.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/UpSampleMoreKernel.cpp.AVX512.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/UpSampleKernel.cpp.AVX512.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/UnfoldBackwardKernel.cpp.AVX512.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/Unfold2d.cpp.AVX512.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/UnaryOpsKernel.cpp.AVX512.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/TensorCompareKernel.cpp.AVX512.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/SumKernel.cpp.AVX512.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/StackKernel.cpp.AVX512.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/SpmmReduceKernel.cpp.AVX512.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/SparseFactories.cpp.AVX512.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/SortingKernel.cpp.AVX512.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/SoftMaxKernel.cpp.AVX512.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/ScatterGatherKernel.cpp.AVX512.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/SampledAddmmKernel.cpp.AVX512.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/RenormKernel.cpp.AVX512.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/ReduceOpsKernel.cpp.AVX512.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/ReduceAllOpsKernel.cpp.AVX512.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/RangeFactoriesKernel.cpp.AVX512.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/PowKernel.cpp.AVX512.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/PointwiseOpsKernel.cpp.AVX512.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/PixelShuffleKernel.cpp.AVX512.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/PaddingKernel.cpp.AVX512.cpp.o
[ 89%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/NativeMultiheadAttnKernel.cpp.AVX512.cpp.o
[ 90%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/MultinomialKernel.cpp.AVX512.cpp.o
[ 90%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/MaxUnpoolKernel.cpp.AVX512.cpp.o
[ 90%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/MaxPooling.cpp.AVX512.cpp.o
[ 90%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/MaxPoolKernel.cpp.AVX512.cpp.o
[ 90%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/LinearAlgebraKernel.cpp.AVX512.cpp.o
[ 90%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/LerpKernel.cpp.AVX512.cpp.o
[ 90%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/IndexKernel.cpp.AVX512.cpp.o
[ 90%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/HistogramKernel.cpp.AVX512.cpp.o
[ 90%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/GridSamplerKernel.cpp.AVX512.cpp.o
[ 90%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/FunctionOfAMatrixUtilsKernel.cpp.AVX512.cpp.o
[ 90%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/FlashAttentionKernel.cpp.AVX512.cpp.o
[ 90%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/FillKernel.cpp.AVX512.cpp.o
[ 90%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/DistributionKernels.cpp.AVX512.cpp.o
[ 90%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/DistanceOpsKernel.cpp.AVX512.cpp.o
[ 90%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/DepthwiseConvKernel.cpp.AVX512.cpp.o
[ 90%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/CrossKernel.cpp.AVX512.cpp.o
[ 90%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/CopyKernel.cpp.AVX512.cpp.o
[ 90%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/ComplexKernel.cpp.AVX512.cpp.o
[ 90%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/ChannelShuffleKernel.cpp.AVX512.cpp.o
[ 90%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/CatKernel.cpp.AVX512.cpp.o
[ 90%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/BlasKernel.cpp.AVX512.cpp.o
[ 90%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/BinaryOpsKernel.cpp.AVX512.cpp.o
[ 90%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/AvgPoolKernel.cpp.AVX512.cpp.o
[ 90%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/AmpGradScalerKernels.cpp.AVX512.cpp.o
[ 90%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/AdaptiveMaxPoolKernel.cpp.AVX512.cpp.o
[ 90%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/AdaptiveAvgPoolKernel.cpp.AVX512.cpp.o
[ 90%] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/cpu/Activation.cpp.AVX512.cpp.o
[ 90%] Linking CXX shared library ../lib/libtorch_cpu.so
Warning: Unused direct dependencies: libc10.so.2.4 /lib64/libqnnpack.so.1 /lib64/libgloo_cuda.so.1 /lib64/liblmdb.so.0.0.0 /lib64/libleveldb.so.1 /lib64/libsnappy.so.1 /lib64/libzmq.so.5 /lib64/libhiredis.so.1.0.0 /lib64/libopencv_highgui.so.409 /lib64/libopencv_optflow.so.409 /lib64/libopencv_videoio.so.409 /lib64/libonnx_optimizer.so /lib64/libfoxi_loader.so.1 /lib64/libopencv_ximgproc.so.409 /lib64/libopencv_imgcodecs.so.409 /lib64/libopencv_video.so.409 /lib64/libopencv_dnn.so.409 /lib64/libopencv_calib3d.so.409 /lib64/libopencv_features2d.so.409 /lib64/libopencv_imgproc.so.409 /lib64/libopencv_flann.so.409 /lib64/libopencv_core.so.409 /lib64/libopencv_cudev.so.409 /usr/local/cuda-12.3/lib64/libcudart.so.12
[ 90%] Built target torch_cpu
[ 90%] Building CXX object caffe2/torch/lib/libshm/CMakeFiles/shm.dir/core.cpp.o
[ 90%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/cuda/CUDAContext.cpp.o
[ 90%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/cuda/CUDAGraph.cpp.o
[ 90%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/cuda/CUDAGeneratorImpl.cpp.o
[ 90%] Linking CXX shared library ../../../../lib/libshm.so
Warning: Unused direct dependencies: libtorch_cpu.so.2.4 /lib64/libprotobuf.so.32 libc10.so.2.4 /lib64/libgflags.so.2.2 /lib64/libglog.so.0 /lib64/libqnnpack.so.1 /lib64/libgloo.so.1 /lib64/libgloo_cuda.so.1 /lib64/libm.so.6
[ 90%] Built target shm
[ 90%] Building CXX object caffe2/torch/lib/libshm/CMakeFiles/torch_shm_manager.dir/manager.cpp.o
[ 91%] Linking CXX executable ../../../../bin/torch_shm_manager
[ 91%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/cuda/CUDASparseDescriptors.cpp.o
Warning: Unused direct dependencies: libshm.so.2.4 libc10.so.2.4 /lib64/libgflags.so.2.2 /lib64/libglog.so.0 /lib64/libm.so.6
[ 91%] Built target torch_shm_manager
[ 91%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/cuda/CachingHostAllocator.cpp.o
[ 91%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/cuda/CuSparseHandlePool.cpp.o
[ 91%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/cuda/EmptyTensor.cpp.o
[ 91%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/cuda/Exceptions.cpp.o
[ 91%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/cuda/PeerToPeerAccess.cpp.o
[ 91%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/cuda/PinnedMemoryAllocator.cpp.o
[ 91%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/cuda/detail/CUDAHooks.cpp.o
[ 91%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/cuda/detail/LazyNVRTC.cpp.o
[ 91%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/cuda/llvm_basic.cpp.o
[ 91%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/cuda/llvm_complex.cpp.o
[ 91%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/Resize.cpp.o
[ 91%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/SpectralOps.cpp.o
[ 91%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/TensorCompare.cpp.o
[ 91%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cudnn/AffineGridGenerator.cpp.o
[ 91%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cudnn/BatchNorm.cpp.o
[ 91%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cudnn/ConvPlaceholders.cpp.o
[ 91%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cudnn/ConvShared.cpp.o
[ 91%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cudnn/Conv_v7.cpp.o
[ 91%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cudnn/Conv_v8.cpp.o
[ 91%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cudnn/GridSampler.cpp.o
[ 91%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cudnn/LossCTC.cpp.o
[ 91%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cudnn/MHA.cpp.o
[ 91%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cudnn/RNN.cpp.o
[ 91%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/miopen/BatchNorm_miopen.cpp.o
[ 91%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/miopen/Conv_miopen.cpp.o
[ 91%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/miopen/RNN_miopen.cpp.o
[ 91%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/nested/cuda/NestedTensorTransformerUtils.cpp.o
[ 91%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/quantized/cuda/Activation.cpp.o
[ 91%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/quantized/cudnn/BinaryOps.cpp.o
[ 91%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/quantized/cudnn/Conv.cpp.o
[ 91%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/quantized/cudnn/ConvPrepack.cpp.o
[ 91%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/quantized/cudnn/ConvUnpackImpl.cpp.o
[ 91%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/quantized/cudnn/Linear.cpp.o
[ 91%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/quantized/cudnn/LinearPrepack.cpp.o
[ 91%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/quantized/cudnn/LinearUnpackImpl.cpp.o
[ 91%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/quantized/cudnn/Pooling.cpp.o
[ 92%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/sparse/cuda/cuSPARSELtOps.cpp.o
[ 92%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/sdp_utils.cpp.o
[ 92%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/cudnn/AutocastRNN.cpp.o
[ 92%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/cudnn/Descriptors.cpp.o
[ 92%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/cudnn/Handle.cpp.o
[ 92%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/cudnn/Types.cpp.o
[ 92%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/torch/csrc/cuda/nccl.cpp.o
[ 92%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/torch/csrc/distributed/c10d/reducer_cuda.cpp.o
[ 92%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/torch/csrc/distributed/c10d/NCCLUtils.cpp.o
[ 92%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp.o
[ 92%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/torch/csrc/distributed/c10d/ProcessGroupUCC.cpp.o
[ 92%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/torch/csrc/distributed/c10d/UCCTracing.cpp.o
[ 92%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/torch/csrc/distributed/c10d/UCCUtils.cpp.o
[ 92%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/torch/csrc/distributed/c10d/intra_node_comm.cpp.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/torch/csrc/distributed/c10d/intra_node_comm.cu.o
[ 92%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/torch/csrc/distributed/rpc/tensorpipe_cuda.cpp.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/torch/csrc/distributed/c10d/quantization/quantization_gpu.cu.o
[ 92%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/torch/csrc/inductor/aoti_torch/generated/c_shim_cuda.cpp.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/TensorFactories.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/cuda/Sleep.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/cuda/cub-RadixSortKeys.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/cuda/cub-RadixSortPairs.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/cuda/cub.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/cuda/detail/IndexUtils.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/cuda/jiterator.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/AbsKernel.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ActivationEluKernel.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ActivationGeluKernel.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ActivationGluKernel.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ActivationHardshrinkKernel.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ActivationHardsigmoidKernel.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ActivationHardswishKernel.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ActivationHardtanhKernel.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ActivationLeakyReluKernel.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ActivationLogSigmoidKernel.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ActivationMishKernel.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ActivationPreluKernel.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ActivationSiluKernel.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ActivationSoftplusKernel.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ActivationSoftshrinkKernel.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ActivationThresholdKernel.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/AdaptiveAveragePooling.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/AdaptiveAveragePooling3d.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/AdaptiveMaxPooling2d.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/AdaptiveMaxPooling3d.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/AmpKernels.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/AveragePool2d.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/AveragePool3d.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/BinaryBitwiseOpsKernels.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/BinaryDivFloorKernel.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/BinaryDivTrueKernel.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/BinaryDivTruncKernel.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/BinaryGeometricKernels.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/BinaryLogicalOpsKernels.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/BinaryMiscBackwardOpsKernels.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/BinaryMiscOpsKernels.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/BinaryMulKernel.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/BinaryRemainderKernel.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/BinaryShiftOpsKernels.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/Bucketization.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/CUDAScalar.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/Col2Im.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/CompareEQKernel.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/CompareKernels.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ComplexKernel.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ConvolutionMM2d.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/Copy.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/CopysignKernel.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/CrossKernel.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/CumminmaxKernel.cu.o
[ 92%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/CumprodKernel.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/CumsumKernel.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/DepthwiseConv2d.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/DepthwiseConv3d.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/DilatedMaxPool2d.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/DilatedMaxPool3d.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/DistanceKernel.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/DistributionBernoulli.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/DistributionCauchyKernel.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/DistributionExponentialKernel.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/DistributionGeometricKernel.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/DistributionLogNormalKernel.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/DistributionNormal.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/DistributionRandomKernel.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/DistributionUniform.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/Distributions.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/Dropout.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/Embedding.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/EmbeddingBackwardKernel.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/EmbeddingBag.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/FillKernel.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/FlattenIndicesKernel.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ForeachBinaryOpList.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ForeachBinaryOpScalar.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ForeachBinaryOpScalarList.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ForeachBinaryOpScalarTensor.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ForeachPointwiseOp.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ForeachReduceOp.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ForeachTernaryOp.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ForeachUnaryOp.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/FractionalMaxPool2d.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/FractionalMaxPool3d.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/FunctionOfAMatrixUtilsKernel.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/FusedAdamKernel.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/FusedAdamWKernel.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/FusedSgdKernel.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/GcdLcmKernel.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/GridSampler.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/IGammaKernel.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/Im2Col.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/IndexKernel.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/Indexing.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/LegacyThrustHelpers.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/Lerp.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/LinearAlgebra.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/LogAddExpKernel.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/LogcumsumexpKernel.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/Loss.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/LossCTC.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/MaxMinElementwiseKernel.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/MaxUnpooling.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/MixedDtypesLinear.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/MultiLabelMarginCriterion.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/MultiMarginLoss.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/MultinomialKernel.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/NLLLoss2d.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/NaiveConvolutionTranspose2d.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/NaiveConvolutionTranspose3d.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/NaiveDilatedConvolution.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/Nonzero.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/Normalization.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/PointwiseOpsKernel.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/PowKernel.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/RNN.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/Randperm.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/RangeFactories.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/RecordStream.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/Reduce.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ReduceAMinMaxKernel.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ReduceArgMaxKernel.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ReduceArgMinKernel.cu.o
[ 93%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ReduceLogicKernel.cu.o
[ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ReduceMaxValuesKernel.cu.o
[ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ReduceMinValuesKernel.cu.o
[ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ReduceMomentKernel.cu.o
[ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ReduceNormKernel.cu.o
[ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ReduceSumProdKernel.cu.o
[ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ReflectionPad.cu.o
[ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/RenormKernel.cu.o
[ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/Repeat.cu.o
[ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ReplicationPadding.cu.o
[ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/RreluWithNoise.cu.o
[ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ScatterGatherKernel.cu.o
[ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/SegmentReduce.cu.o
[ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/Shape.cu.o
[ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/SoftMax.cu.o
/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type [&] { const auto& the_type = input.scalar_type(); constexpr const char* at_dispatch_name = "host_softmax"; at::ScalarType _st = ::detail::scalar_type(the_type); ; switch (_st) { case at::ScalarType::Double: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Double)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Double), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); 
can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); 
} case at::ScalarType::Float: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Float)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Float), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do 
{ const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Half: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Half)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), 
(::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Half), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = 
input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::BFloat16: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::BFloat16)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? 
If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::BFloat16), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = 
(1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } default: do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? 
If so, " "please report an enhancement request to PyTorch.)", ::c10::str('"', at_dispatch_name, "\" not implemented for '", toString(_st), "'")))); }; } while (false); } }() ^ Remark: The warnings can be suppressed with "-diag-suppress "
__err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Float: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Float)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Float), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if 
(can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Half: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, 
at::ScalarType::Half)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Half), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( 
static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::BFloat16: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::BFloat16)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. 
" "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::BFloat16), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t 
remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } default: do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? 
If so, " "please report an enhancement request to PyTorch.)", ::c10::str('"', at_dispatch_name, "\" not implemented for '", toString(_st), "'")))); }; } while (false); } }() ^ /builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type [&] { const auto& the_type = input.scalar_type(); constexpr const char* at_dispatch_name = "host_softmax"; at::ScalarType _st = ::detail::scalar_type(the_type); ; switch (_st) { case at::ScalarType::Double: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Double)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Double), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = 
(at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t 
__err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Float: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Float)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Float), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if 
(can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Half: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, 
at::ScalarType::Half)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Half), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( 
static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::BFloat16: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::BFloat16)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. 
" "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::BFloat16), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t 
remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } default: do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? 
If so, " "please report an enhancement request to PyTorch.)", ::c10::str('"', at_dispatch_name, "\" not implemented for '", toString(_st), "'")))); }; } while (false); } }() ^ /builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type [&] { const auto& the_type = input.scalar_type(); constexpr const char* at_dispatch_name = "host_softmax"; at::ScalarType _st = ::detail::scalar_type(the_type); ; switch (_st) { case at::ScalarType::Double: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Double)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Double), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = 
(at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t 
__err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Float: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Float)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Float), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if 
(can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Half: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, 
at::ScalarType::Half)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Half), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( 
static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::BFloat16: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::BFloat16)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. 
" "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::BFloat16), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t 
remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } default: do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? 
If so, " "please report an enhancement request to PyTorch.)", ::c10::str('"', at_dispatch_name, "\" not implemented for '", toString(_st), "'")))); }; } while (false); } }() ^ /builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type
/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type [&] { const auto& the_type = input.scalar_type(); constexpr const char* at_dispatch_name = "host_softmax"; at::ScalarType _st = ::detail::scalar_type(the_type); ; switch (_st) { case at::ScalarType::Double: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Double)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Double), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem =
(at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t);
bool can_use_smem = dim_size < max_elements_per_smem;
can_use_smem &= !(reinterpret_cast<const uintptr_t>(input_ptr) % ALIGN_BYTES);
can_use_smem &= (!(reinterpret_cast<const uintptr_t>(output_ptr) % ALIGN_BYTES));
can_use_smem &= !(dim_size % ILP);
if (can_use_smem) {
  size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz;
  cunn_SoftMaxForwardSmem<ILP, scalar_t, accscalar_t, scalar_t, Epilogue><<<grid, block, smem_sz, stream>>>(output_ptr, input_ptr, dim_size);
} else {
  cunn_SoftMaxForward<ILP, scalar_t, accscalar_t, scalar_t, Epilogue><<<grid, block, smem_reduction_sz, stream>>>(output_ptr, input_ptr, dim_size);
}
do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast<int32_t>(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast<uint32_t>(880), true); } while (0);
} } else {
auto output_ptr = output.mutable_data_ptr<accscalar_t>();
auto input_ptr = input.const_data_ptr<scalar_t>();
if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) {
  int64_t remaining = outer_size;
  int64_t chunk_size = (1<<30) / dim_size;
  while(remaining > 0) {
    dispatch_softmax_forward<scalar_t, accscalar_t, accscalar_t, is_log_softmax, false>( output_ptr, input_ptr, dim_size, dim_size, std::min<int64_t>(remaining, chunk_size), nullptr );
    input_ptr += chunk_size * dim_size;
    output_ptr += chunk_size * dim_size;
    remaining -= chunk_size;
  }
} else {
  constexpr int ILP = sizeof(float4) / sizeof(scalar_t);
  dim3 block = SoftMaxForward_getBlockSize(dim_size);
  size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t);
  auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t);
  bool can_use_smem = dim_size < max_elements_per_smem;
  can_use_smem &= !(reinterpret_cast<const uintptr_t>(input_ptr) % ALIGN_BYTES);
  can_use_smem &= (!(reinterpret_cast<const uintptr_t>(output_ptr) % ALIGN_BYTES));
  can_use_smem &= !(dim_size % ILP);
  if (can_use_smem) {
    size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz;
    cunn_SoftMaxForwardSmem<ILP, scalar_t, accscalar_t, accscalar_t, Epilogue><<<grid, block, smem_sz, stream>>>(output_ptr, input_ptr, dim_size);
  } else {
    cunn_SoftMaxForward<ILP, scalar_t, accscalar_t, accscalar_t, Epilogue><<<grid, block, smem_reduction_sz, stream>>>(output_ptr, input_ptr, dim_size);
  }
  do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast<int32_t>(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast<uint32_t>(916), true); } while (0);
} } }(); }
case at::ScalarType::Float: {
do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Float)) {
  do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast<uint32_t>(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Float), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false);
} } while (0);
using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT<at::ScalarType::Float>;
return [&] {
using accscalar_t = acc_type<scalar_t, true>;
if (!half_to_float) {
  auto output_ptr = output.mutable_data_ptr<scalar_t>();
  auto input_ptr = input.const_data_ptr<scalar_t>();
  if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) {
    int64_t remaining = outer_size;
    int64_t chunk_size = (1L << 30L) / dim_size;
    while(remaining > 0) {
      dispatch_softmax_forward<scalar_t, scalar_t, accscalar_t, is_log_softmax, false>( output_ptr, input_ptr, dim_size, dim_size, std::min<int64_t>(remaining, chunk_size), nullptr );
      input_ptr += chunk_size * dim_size;
      output_ptr += chunk_size * dim_size;
      remaining -= chunk_size;
    }
  } else {
    constexpr int ILP = sizeof(float4) / sizeof(scalar_t);
    dim3 block = SoftMaxForward_getBlockSize(dim_size);
    size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t);
    auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t);
    bool can_use_smem = dim_size < max_elements_per_smem;
    can_use_smem &= !(reinterpret_cast<const uintptr_t>(input_ptr) % ALIGN_BYTES);
    can_use_smem &= (!(reinterpret_cast<const uintptr_t>(output_ptr) % ALIGN_BYTES));
    can_use_smem &= !(dim_size % ILP);
    if (can_use_smem) {
      size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz;
      cunn_SoftMaxForwardSmem<ILP, scalar_t, accscalar_t, scalar_t, Epilogue><<<grid, block, smem_sz, stream>>>(output_ptr, input_ptr, dim_size);
    } else {
      cunn_SoftMaxForward<ILP, scalar_t, accscalar_t, scalar_t, Epilogue><<<grid, block, smem_reduction_sz, stream>>>(output_ptr, input_ptr, dim_size);
    }
    do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast<int32_t>(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast<uint32_t>(880), true); } while (0);
  }
} else {
  auto output_ptr = output.mutable_data_ptr<accscalar_t>();
  auto input_ptr = input.const_data_ptr<scalar_t>();
  if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) {
    int64_t remaining = outer_size;
    int64_t chunk_size = (1<<30) / dim_size;
    while(remaining > 0) {
      dispatch_softmax_forward<scalar_t, accscalar_t, accscalar_t, is_log_softmax, false>( output_ptr, input_ptr, dim_size, dim_size, std::min<int64_t>(remaining, chunk_size), nullptr );
      input_ptr += chunk_size * dim_size;
      output_ptr += chunk_size * dim_size;
      remaining -= chunk_size;
    }
  } else {
    constexpr int ILP = sizeof(float4) / sizeof(scalar_t);
    dim3 block = SoftMaxForward_getBlockSize(dim_size);
    size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t);
    auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t);
    bool can_use_smem = dim_size < max_elements_per_smem;
    can_use_smem &= !(reinterpret_cast<const uintptr_t>(input_ptr) % ALIGN_BYTES);
    can_use_smem &= (!(reinterpret_cast<const uintptr_t>(output_ptr) % ALIGN_BYTES));
    can_use_smem &= !(dim_size % ILP);
    if (can_use_smem) {
      size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz;
      cunn_SoftMaxForwardSmem<ILP, scalar_t, accscalar_t, accscalar_t, Epilogue><<<grid, block, smem_sz, stream>>>(output_ptr, input_ptr, dim_size);
    } else {
      cunn_SoftMaxForward<ILP, scalar_t, accscalar_t, accscalar_t, Epilogue><<<grid, block, smem_reduction_sz, stream>>>(output_ptr, input_ptr, dim_size);
    }
    do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast<int32_t>(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast<uint32_t>(916), true); } while (0);
} } }(); }
case at::ScalarType::Half: {
do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Half)) {
  do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast<uint32_t>(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Half), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false);
} } while (0);
using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT<at::ScalarType::Half>;
return [&] {
using accscalar_t = acc_type<scalar_t, true>;
if (!half_to_float) {
  auto output_ptr = output.mutable_data_ptr<scalar_t>();
  auto input_ptr = input.const_data_ptr<scalar_t>();
  if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) {
    int64_t remaining = outer_size;
    int64_t chunk_size = (1L << 30L) / dim_size;
    while(remaining > 0) {
      dispatch_softmax_forward<scalar_t, scalar_t, accscalar_t, is_log_softmax, false>( output_ptr, input_ptr, dim_size, dim_size, std::min<int64_t>(remaining, chunk_size), nullptr );
      input_ptr += chunk_size * dim_size;
      output_ptr += chunk_size * dim_size;
      remaining -= chunk_size;
    }
  } else {
    constexpr int ILP = sizeof(float4) / sizeof(scalar_t);
    dim3 block = SoftMaxForward_getBlockSize(dim_size);
    size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t);
    auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t);
    bool can_use_smem = dim_size < max_elements_per_smem;
    can_use_smem &= !(reinterpret_cast<const uintptr_t>(input_ptr) % ALIGN_BYTES);
    can_use_smem &= (!(reinterpret_cast<const uintptr_t>(output_ptr) % ALIGN_BYTES));
    can_use_smem &= !(dim_size % ILP);
    if (can_use_smem) {
      size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz;
      cunn_SoftMaxForwardSmem<ILP, scalar_t, accscalar_t, scalar_t, Epilogue><<<grid, block, smem_sz, stream>>>(output_ptr, input_ptr, dim_size);
    } else {
      cunn_SoftMaxForward<ILP, scalar_t, accscalar_t, scalar_t, Epilogue><<<grid, block, smem_reduction_sz, stream>>>(output_ptr, input_ptr, dim_size);
    }
    do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast<int32_t>(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast<uint32_t>(880), true); } while (0);
  }
} else {
  auto output_ptr = output.mutable_data_ptr<accscalar_t>();
  auto input_ptr = input.const_data_ptr<scalar_t>();
  if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) {
    int64_t remaining = outer_size;
    int64_t chunk_size = (1<<30) / dim_size;
    while(remaining > 0) {
      dispatch_softmax_forward<scalar_t, accscalar_t, accscalar_t, is_log_softmax, false>( output_ptr, input_ptr, dim_size, dim_size, std::min<int64_t>(remaining, chunk_size), nullptr );
      input_ptr += chunk_size * dim_size;
      output_ptr += chunk_size * dim_size;
      remaining -= chunk_size;
    }
  } else {
    constexpr int ILP = sizeof(float4) / sizeof(scalar_t);
    dim3 block = SoftMaxForward_getBlockSize(dim_size);
    size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t);
    auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t);
    bool can_use_smem = dim_size < max_elements_per_smem;
    can_use_smem &= !(reinterpret_cast<const uintptr_t>(input_ptr) % ALIGN_BYTES);
    can_use_smem &= (!(reinterpret_cast<const uintptr_t>(output_ptr) % ALIGN_BYTES));
    can_use_smem &= !(dim_size % ILP);
    if (can_use_smem) {
      size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz;
      cunn_SoftMaxForwardSmem<ILP, scalar_t, accscalar_t, accscalar_t, Epilogue><<<grid, block, smem_sz, stream>>>(output_ptr, input_ptr, dim_size);
    } else {
      cunn_SoftMaxForward<ILP, scalar_t, accscalar_t, accscalar_t, Epilogue><<<grid, block, smem_reduction_sz, stream>>>(output_ptr, input_ptr, dim_size);
    }
    do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast<int32_t>(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast<uint32_t>(916), true); } while (0);
} } }(); }
case at::ScalarType::BFloat16: {
do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::BFloat16)) {
  do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast<uint32_t>(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::BFloat16), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false);
} } while (0);
using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT<at::ScalarType::BFloat16>;
return [&] {
using accscalar_t = acc_type<scalar_t, true>;
if (!half_to_float) {
  auto output_ptr = output.mutable_data_ptr<scalar_t>();
  auto input_ptr = input.const_data_ptr<scalar_t>();
  if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) {
    int64_t remaining = outer_size;
    int64_t chunk_size = (1L << 30L) / dim_size;
    while(remaining > 0) {
      dispatch_softmax_forward<scalar_t, scalar_t, accscalar_t, is_log_softmax, false>( output_ptr, input_ptr, dim_size, dim_size, std::min<int64_t>(remaining, chunk_size), nullptr );
      input_ptr += chunk_size * dim_size;
      output_ptr += chunk_size * dim_size;
      remaining -= chunk_size;
    }
  } else {
    constexpr int ILP = sizeof(float4) / sizeof(scalar_t);
    dim3 block = SoftMaxForward_getBlockSize(dim_size);
    size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t);
    auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t);
    bool can_use_smem = dim_size < max_elements_per_smem;
    can_use_smem &= !(reinterpret_cast<const uintptr_t>(input_ptr) % ALIGN_BYTES);
    can_use_smem &= (!(reinterpret_cast<const uintptr_t>(output_ptr) % ALIGN_BYTES));
    can_use_smem &= !(dim_size % ILP);
    if (can_use_smem) {
      size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz;
      cunn_SoftMaxForwardSmem<ILP, scalar_t, accscalar_t, scalar_t, Epilogue><<<grid, block, smem_sz, stream>>>(output_ptr, input_ptr, dim_size);
    } else {
      cunn_SoftMaxForward<ILP, scalar_t, accscalar_t, scalar_t, Epilogue><<<grid, block, smem_reduction_sz, stream>>>(output_ptr, input_ptr, dim_size);
    }
    do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast<int32_t>(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast<uint32_t>(880), true); } while (0);
  }
} else {
  auto output_ptr = output.mutable_data_ptr<accscalar_t>();
  auto input_ptr = input.const_data_ptr<scalar_t>();
  if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) {
    int64_t remaining = outer_size;
    int64_t chunk_size = (1<<30) / dim_size;
    while(remaining > 0) {
      dispatch_softmax_forward<scalar_t, accscalar_t, accscalar_t, is_log_softmax, false>( output_ptr, input_ptr, dim_size, dim_size, std::min<int64_t>(remaining, chunk_size), nullptr );
      input_ptr += chunk_size * dim_size;
      output_ptr += chunk_size * dim_size;
      remaining -= chunk_size;
    }
  } else {
    constexpr int ILP = sizeof(float4) / sizeof(scalar_t);
    dim3 block = SoftMaxForward_getBlockSize(dim_size);
    size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t);
    auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t);
    bool can_use_smem = dim_size < max_elements_per_smem;
    can_use_smem &= !(reinterpret_cast<const uintptr_t>(input_ptr) % ALIGN_BYTES);
    can_use_smem &= (!(reinterpret_cast<const uintptr_t>(output_ptr) % ALIGN_BYTES));
    can_use_smem &= !(dim_size % ILP);
    if (can_use_smem) {
      size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz;
      cunn_SoftMaxForwardSmem<ILP, scalar_t, accscalar_t, accscalar_t, Epilogue><<<grid, block, smem_sz, stream>>>(output_ptr, input_ptr, dim_size);
    } else {
      cunn_SoftMaxForward<ILP, scalar_t, accscalar_t, accscalar_t, Epilogue><<<grid, block, smem_reduction_sz, stream>>>(output_ptr, input_ptr, dim_size);
    }
    do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast<int32_t>(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast<uint32_t>(916), true); } while (0);
} } }(); }
default: do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast<uint32_t>(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str('"', at_dispatch_name, "\" not implemented for '", toString(_st), "'")))); }; } while (false); } }()
 ^
Remark: The warnings can be suppressed with "-diag-suppress <error-number>"
/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type
/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type
" "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::BFloat16), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t 
remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } default: do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? 
If so, " "please report an enhancement request to PyTorch.)", ::c10::str('"', at_dispatch_name, "\" not implemented for '", toString(_st), "'")))); }; } while (false); } }() ^ /builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type [&] { const auto& the_type = input.scalar_type(); constexpr const char* at_dispatch_name = "host_softmax"; at::ScalarType _st = ::detail::scalar_type(the_type); ; switch (_st) { case at::ScalarType::Double: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Double)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Double), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem =
(at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t 
__err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Float: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Float)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Float), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if 
(can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Half: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, 
at::ScalarType::Half)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Half), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( 
static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::BFloat16: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::BFloat16)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. 
" "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::BFloat16), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t 
remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } default: do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? 
If so, " "please report an enhancement request to PyTorch.)", ::c10::str('"', at_dispatch_name, "\" not implemented for '", toString(_st), "'")))); }; } while (false); } }() ^ /builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type
If so, " "please report an enhancement request to PyTorch.)", ::c10::str('"', at_dispatch_name, "\" not implemented for '", toString(_st), "'")))); }; } while (false); } }() ^
/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type
/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type
Remark: The warnings can be suppressed with "-diag-suppress <error-number>"
/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type [&] { const auto& the_type = input.scalar_type(); constexpr const char* at_dispatch_name = "host_softmax"; at::ScalarType _st = ::detail::scalar_type(the_type); ; switch (_st) { case at::ScalarType::Double: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Double)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Double), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto
max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } 
do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Float: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Float)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Float), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= 
!(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Half: { do { if constexpr 
(!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Half)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Half), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); 
c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::BFloat16: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::BFloat16)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " 
"false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::BFloat16), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && 
dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } default: do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? 
If so, " "please report an enhancement request to PyTorch.)", ::c10::str('"', at_dispatch_name, "\" not implemented for '", toString(_st), "'")))); }; } while (false); } }() ^ /builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type [&] { const auto& the_type = input.scalar_type(); constexpr const char* at_dispatch_name = "host_softmax"; at::ScalarType _st = ::detail::scalar_type(the_type); ; switch (_st) { case at::ScalarType::Double: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Double)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Double), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = 
(at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t 
__err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Float: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Float)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Float), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if 
(can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Half: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, 
at::ScalarType::Half)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Half), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( 
static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::BFloat16: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::BFloat16)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. 
" "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::BFloat16), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t 
remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } default: do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? 
If so, " "please report an enhancement request to PyTorch.)", ::c10::str('"', at_dispatch_name, "\" not implemented for '", toString(_st), "'")))); }; } while (false); } }() ^ /builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type [&] { const auto& the_type = input.scalar_type(); constexpr const char* at_dispatch_name = "host_softmax"; at::ScalarType _st = ::detail::scalar_type(the_type); ; switch (_st) { case at::ScalarType::Double: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Double)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Double), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = 
(at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t 
__err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Float: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Float)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Float), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if 
(can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Half: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, 
at::ScalarType::Half)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Half), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( 
static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::BFloat16: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::BFloat16)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. 
" "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::BFloat16), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t 
remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } default: do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? 
If so, " "please report an enhancement request to PyTorch.)", ::c10::str('"', at_dispatch_name, "\" not implemented for '", toString(_st), "'")))); }; } while (false); } }() ^ /builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type [&] { const auto& the_type = input.scalar_type(); constexpr const char* at_dispatch_name = "host_softmax"; at::ScalarType _st = ::detail::scalar_type(the_type); ; switch (_st) { case at::ScalarType::Double: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Double)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Double), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = 
(at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t 
__err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Float: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Float)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Float), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if 
(can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Half: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, 
at::ScalarType::Half)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Half), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( 
static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::BFloat16: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::BFloat16)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. 
" "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::BFloat16), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t 
remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } default: do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? 
If so, " "please report an enhancement request to PyTorch.)", ::c10::str('"', at_dispatch_name, "\" not implemented for '", toString(_st), "'")))); }; } while (false); } }() ^ /builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type [&] { const auto& the_type = input.scalar_type(); constexpr const char* at_dispatch_name = "host_softmax"; at::ScalarType _st = ::detail::scalar_type(the_type); ; switch (_st) { case at::ScalarType::Double: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Double)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Double), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = 
(at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t 
__err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Float: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Float)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Float), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if 
(can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Half: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, 
at::ScalarType::Half)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Half), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( 
static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::BFloat16: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::BFloat16)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. 
" "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::BFloat16), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t 
remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } default: do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? 
If so, " "please report an enhancement request to PyTorch.)", ::c10::str('"', at_dispatch_name, "\" not implemented for '", toString(_st), "'")))); }; } while (false); } }()
^ /builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type
^ /builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type [&] { const auto& the_type = input.scalar_type(); constexpr const char* at_dispatch_name = "host_softmax"; at::ScalarType _st = ::detail::scalar_type(the_type); ; switch (_st) { case at::ScalarType::Double: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Double)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Double), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem =
(at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t 
__err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Float: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Float)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Float), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if 
(can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Half: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, 
at::ScalarType::Half)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Half), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( 
static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::BFloat16: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::BFloat16)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. 
" "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::BFloat16), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t 
remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } default: do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? 
If so, " "please report an enhancement request to PyTorch.)", ::c10::str('"', at_dispatch_name, "\" not implemented for '", toString(_st), "'")))); }; } while (false); } }() ^
[ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/Sort.cu.o
/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type
Remark: The warnings can be suppressed with "-diag-suppress "
/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type
c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::BFloat16: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::BFloat16)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " 
"false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::BFloat16), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && 
dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } default: do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? 
If so, " "please report an enhancement request to PyTorch.)", ::c10::str('"', at_dispatch_name, "\" not implemented for '", toString(_st), "'")))); }; } while (false); } }() ^ /builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type [&] { const auto& the_type = input.scalar_type(); constexpr const char* at_dispatch_name = "host_softmax"; at::ScalarType _st = ::detail::scalar_type(the_type); ; switch (_st) { case at::ScalarType::Double: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Double)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Double), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = 
(at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t 
__err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Float: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Float)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Float), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if 
(can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Half: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, 
at::ScalarType::Half)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Half), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( 
static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::BFloat16: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::BFloat16)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. 
" "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::BFloat16), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t 
remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } default: do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? 
If so, " "please report an enhancement request to PyTorch.)", ::c10::str('"', at_dispatch_name, "\" not implemented for '", toString(_st), "'")))); }; } while (false); } }() ^ /builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type [&] { const auto& the_type = input.scalar_type(); constexpr const char* at_dispatch_name = "host_softmax"; at::ScalarType _st = ::detail::scalar_type(the_type); ; switch (_st) { case at::ScalarType::Double: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Double)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Double), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = 
(at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t 
__err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Float: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Float)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Float), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if 
(can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Half: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, 
at::ScalarType::Half)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Half), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( 
static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::BFloat16: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::BFloat16)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. 
" "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::BFloat16), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t 
remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } default: do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? 
If so, " "please report an enhancement request to PyTorch.)", ::c10::str('"', at_dispatch_name, "\" not implemented for '", toString(_st), "'")))); }; } while (false); } }() ^ /builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type
/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type [&] { const auto& the_type = input.scalar_type(); constexpr const char* at_dispatch_name = "host_softmax"; at::ScalarType _st = ::detail::scalar_type(the_type); ; switch (_st) { case at::ScalarType::Double: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Double)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Double), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem =
(at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t 
__err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Float: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Float)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Float), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if 
(can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Half: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, 
at::ScalarType::Half)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Half), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( 
static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::BFloat16: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::BFloat16)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. 
" "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::BFloat16), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t 
remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } default: do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? 
If so, " "please report an enhancement request to PyTorch.)", ::c10::str('"', at_dispatch_name, "\" not implemented for '", toString(_st), "'")))); }; } while (false); } }()
 ^
/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type
/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type [&] { const auto& the_type = input.scalar_type(); constexpr const char* at_dispatch_name = "host_softmax"; at::ScalarType _st = ::detail::scalar_type(the_type); ; switch (_st) { case at::ScalarType::Double: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Double)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Double), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem =
(at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t 
__err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Float: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Float)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Float), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if 
(can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Half: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, 
at::ScalarType::Half)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Half), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( 
static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::BFloat16: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::BFloat16)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. 
" "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::BFloat16), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t 
remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } default: do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? 
If so, " "please report an enhancement request to PyTorch.)", ::c10::str('"', at_dispatch_name, "\" not implemented for '", toString(_st), "'")))); }; } while (false); } }() ^ /builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type [&] { const auto& the_type = input.scalar_type(); constexpr const char* at_dispatch_name = "host_softmax"; at::ScalarType _st = ::detail::scalar_type(the_type); ; switch (_st) { case at::ScalarType::Double: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Double)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Double), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = 
(at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t 
__err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Float: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Float)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Float), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if 
(can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Half: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, 
at::ScalarType::Half)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Half), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( 
static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::BFloat16: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::BFloat16)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. 
" "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::BFloat16), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t 
remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } default: do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? 
If so, " "please report an enhancement request to PyTorch.)", ::c10::str('"', at_dispatch_name, "\" not implemented for '", toString(_st), "'")))); }; } while (false); } }() ^ Remark: The warnings can be suppressed with "-diag-suppress " /builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type [&] { const auto& the_type = input.scalar_type(); constexpr const char* at_dispatch_name = "host_softmax"; at::ScalarType _st = ::detail::scalar_type(the_type); ; switch (_st) { case at::ScalarType::Double: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Double)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Double), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto 
max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } 
do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Float: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Float)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Float), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= 
!(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Half: { do { if constexpr 
(!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Half)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Half), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); 
c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::BFloat16: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::BFloat16)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " 
"false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::BFloat16), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && 
dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } default: do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? 
If so, " "please report an enhancement request to PyTorch.)", ::c10::str('"', at_dispatch_name, "\" not implemented for '", toString(_st), "'")))); }; } while (false); } }() ^ /builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type
[&] { const auto& the_type = input.scalar_type(); constexpr const char* at_dispatch_name = "host_softmax"; at::ScalarType _st = ::detail::scalar_type(the_type); ; switch (_st) { case at::ScalarType::Double: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Double)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Double), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem =
(at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t
__err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Float: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Float)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Float), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if 
(can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Half: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, 
at::ScalarType::Half)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Half), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( 
static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::BFloat16: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::BFloat16)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. 
" "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::BFloat16), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t 
remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } default: do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? 
If so, " "please report an enhancement request to PyTorch.)", ::c10::str('"', at_dispatch_name, "\" not implemented for '", toString(_st), "'")))); }; } while (false); } }() ^
/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type
/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type [&] { const auto& the_type = input.scalar_type(); constexpr const char* at_dispatch_name = "host_softmax"; at::ScalarType _st = ::detail::scalar_type(the_type); ; switch (_st) { case at::ScalarType::Double: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Double)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Double), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem =
(at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t 
__err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Float: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Float)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Float), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if 
(can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Half: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, 
at::ScalarType::Half)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Half), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( 
static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::BFloat16: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::BFloat16)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. 
" "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::BFloat16), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t 
remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } default: do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? 
If so, " "please report an enhancement request to PyTorch.)", ::c10::str('"', at_dispatch_name, "\" not implemented for '", toString(_st), "'")))); }; } while (false); } }() ^ /builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type [&] { const auto& the_type = input.scalar_type(); constexpr const char* at_dispatch_name = "host_softmax"; at::ScalarType _st = ::detail::scalar_type(the_type); ; switch (_st) { case at::ScalarType::Double: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Double)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Double), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = 
(at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t 
__err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Float: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Float)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Float), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if 
(can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Half: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, 
at::ScalarType::Half)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Half), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( 
static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::BFloat16: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::BFloat16)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. 
" "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::BFloat16), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t 
remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } default: do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? 
If so, " "please report an enhancement request to PyTorch.)", ::c10::str('"', at_dispatch_name, "\" not implemented for '", toString(_st), "'")))); }; } while (false); } }() ^ Remark: The warnings can be suppressed with "-diag-suppress " /builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type 
/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type [&] { const auto& the_type = input.scalar_type(); constexpr const char* at_dispatch_name = "host_softmax"; at::ScalarType _st = ::detail::scalar_type(the_type); ; switch (_st) { case at::ScalarType::Double: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Double)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Double), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = 
(at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t 
__err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Float: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Float)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Float), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if 
(can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Half: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, 
at::ScalarType::Half)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Half), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( 
static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::BFloat16: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::BFloat16)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. 
" "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::BFloat16), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t 
remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } default: do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? 
If so, " "please report an enhancement request to PyTorch.)", ::c10::str('"', at_dispatch_name, "\" not implemented for '", toString(_st), "'")))); }; } while (false); } }() ^
/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type
" "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::BFloat16), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t 
remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } default: do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? 
If so, " "please report an enhancement request to PyTorch.)", ::c10::str('"', at_dispatch_name, "\" not implemented for '", toString(_st), "'")))); }; } while (false); } }() ^ /builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type [&] { const auto& the_type = input.scalar_type(); constexpr const char* at_dispatch_name = "host_softmax"; at::ScalarType _st = ::detail::scalar_type(the_type); ; switch (_st) { case at::ScalarType::Double: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Double)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Double), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = 
(at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t 
__err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Float: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Float)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Float), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if 
(can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Half: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, 
at::ScalarType::Half)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Half), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( 
static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::BFloat16: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::BFloat16)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. 
" "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::BFloat16), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t 
remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } default: do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? 
If so, " "please report an enhancement request to PyTorch.)", ::c10::str('"', at_dispatch_name, "\" not implemented for '", toString(_st), "'")))); }; } while (false); } }() ^ /builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type [&] { const auto& the_type = input.scalar_type(); constexpr const char* at_dispatch_name = "host_softmax"; at::ScalarType _st = ::detail::scalar_type(the_type); ; switch (_st) { case at::ScalarType::Double: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Double)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Double), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = 
(at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t 
__err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Float: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Float)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Float), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if 
(can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Half: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, 
at::ScalarType::Half)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Half), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( 
static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::BFloat16: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::BFloat16)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. 
" "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::BFloat16), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t 
remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } default: do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? 
If so, " "please report an enhancement request to PyTorch.)", ::c10::str('"', at_dispatch_name, "\" not implemented for '", toString(_st), "'")))); }; } while (false); } }() ^ 
/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type [&] { const auto& the_type = input.scalar_type(); constexpr const char* at_dispatch_name = "host_softmax"; at::ScalarType _st = ::detail::scalar_type(the_type); ; switch (_st) { case at::ScalarType::Double: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Double)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Double), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem =
(at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t 
__err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Float: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Float)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Float), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if 
(can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Half: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, 
at::ScalarType::Half)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Half), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( 
static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::BFloat16: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::BFloat16)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. 
" "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::BFloat16), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t 
remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } default: do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? 
If so, " "please report an enhancement request to PyTorch.)", ::c10::str('"', at_dispatch_name, "\" not implemented for '", toString(_st), "'")))); }; } while (false); } }() ^ Remark: The warnings can be suppressed with "-diag-suppress " /builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type [&] { const auto& the_type = input.scalar_type(); constexpr const char* at_dispatch_name = "host_softmax"; at::ScalarType _st = ::detail::scalar_type(the_type); ; switch (_st) { case at::ScalarType::Double: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Double)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Double), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem =
(at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t 
__err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Float: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Float)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Float), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if 
(can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Half: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, 
at::ScalarType::Half)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Half), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( 
static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::BFloat16: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::BFloat16)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. 
" "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::BFloat16), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t 
remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } default: do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? 
If so, " "please report an enhancement request to PyTorch.)", ::c10::str('"', at_dispatch_name, "\" not implemented for '", toString(_st), "'")))); }; } while (false); } }()
__err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Float: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Float)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Float), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if 
(can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Half: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, 
at::ScalarType::Half)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Half), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( 
static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::BFloat16: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::BFloat16)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. 
" "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::BFloat16), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t 
remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } default: do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? 
If so, " "please report an enhancement request to PyTorch.)", ::c10::str('"', at_dispatch_name, "\" not implemented for '", toString(_st), "'")))); }; } while (false); } }() ^ /builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type [&] { const auto& the_type = input.scalar_type(); constexpr const char* at_dispatch_name = "host_softmax"; at::ScalarType _st = ::detail::scalar_type(the_type); ; switch (_st) { case at::ScalarType::Double: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Double)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Double), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = 
(at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t 
__err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Float: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::Float)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Float), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if 
(can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::Half: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, 
at::ScalarType::Half)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. " "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::Half), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( 
static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1<<30) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(916), true); } while (0); } } }(); } case at::ScalarType::BFloat16: { do { if constexpr (!at::should_include_kernel_dtype( at_dispatch_name, at::ScalarType::BFloat16)) { do { ::c10::detail::deprecated_AT_ERROR(); if (!(false)) { ::c10::detail::torchCheckFail( __func__, "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", static_cast(844), (::c10::detail::torchCheckMsgImpl( "Expected " "false" " to be true, but got false. 
" "(Could this error message be improved? If so, " "please report an enhancement request to PyTorch.)", ::c10::str("dtype '", toString(at::ScalarType::BFloat16), "' not selected for kernel tag ", at_dispatch_name)))); }; } while (false); } } while (0); using scalar_t __attribute__((__unused__)) = c10::impl::ScalarTypeToCPPTypeT; return [&] { using accscalar_t = acc_type; if (!half_to_float) { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t remaining = outer_size; int64_t chunk_size = (1L << 30L) / dim_size; while(remaining > 0) { dispatch_softmax_forward( output_ptr, input_ptr, dim_size, dim_size, std::min(remaining, chunk_size), nullptr ); input_ptr += chunk_size * dim_size; output_ptr += chunk_size * dim_size; remaining -= chunk_size; } } else { constexpr int ILP = sizeof(float4) / sizeof(scalar_t); dim3 block = SoftMaxForward_getBlockSize(dim_size); size_t smem_reduction_sz = block.x / 32 * sizeof(accscalar_t); auto max_elements_per_smem = (at::cuda::getCurrentDeviceProperties()->sharedMemPerBlock - smem_reduction_sz) / sizeof(scalar_t); bool can_use_smem = dim_size < max_elements_per_smem; can_use_smem &= !(reinterpret_cast(input_ptr) % ALIGN_BYTES); can_use_smem &= (!(reinterpret_cast(output_ptr) % ALIGN_BYTES)); can_use_smem &= !(dim_size % ILP); if (can_use_smem) { size_t smem_sz = dim_size * sizeof(scalar_t) + smem_reduction_sz; cunn_SoftMaxForwardSmem <<>>(output_ptr, input_ptr, dim_size); } else { cunn_SoftMaxForward <<>>(output_ptr, input_ptr, dim_size); } do { const cudaError_t __err = cudaGetLastError(); c10::cuda::c10_cuda_check_implementation( static_cast(__err), "/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu", __func__, static_cast(880), true); } while (0); } } else { auto output_ptr = output.mutable_data_ptr(); auto input_ptr = input.const_data_ptr(); if (dim_size <= 1024 && dim_size*sizeof(scalar_t) <= 4096) { int64_t 
[compiler echo of the preprocessed AT_DISPATCH_FLOATING_TYPES_AND2("host_softmax") lambda; template argument lists and <<<grid, block, smem, stream>>> kernel-launch configurations were stripped from the captured log] }() ^
/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu(844): warning #191-D: type qualifier is meaningless on cast type
[the same preprocessed "host_softmax" dispatch expansion is echoed again in full for the Double, Float, Half and BFloat16 cases] }() ^
[ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/SortImpl.cu.o
/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu: In instantiation of 'at::Tensor at::native::_GLOBAL__N__08542f1a_10_SoftMax_cu_9f978f63::host_softmax(const at::Tensor&, int64_t, bool, const at::Tensor&) [with Epilogue = LogSoftMaxForwardEpilogue; bool is_log_softmax = true; int64_t = long int]':
/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu:1072:56:   required from here
/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu:844:2132: warning: comparison of integer expressions of different signedness: 'int64_t' {aka 'long int'} and 'long unsigned int' [-Wsign-compare]
  844 | AT_DISPATCH_FLOATING_TYPES_AND2(at::ScalarType::Half, at::ScalarType::BFloat16, input.scalar_type(), "host_softmax", [&] {
      | ^
[this -Wsign-compare warning is repeated for each remaining dispatch case of this instantiation]
/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu: In instantiation of 'at::Tensor at::native::_GLOBAL__N__08542f1a_10_SoftMax_cu_9f978f63::host_softmax(const at::Tensor&, int64_t, bool, const at::Tensor&) [with Epilogue = SoftMaxForwardEpilogue; bool is_log_softmax = false; int64_t = long int]':
/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu:1096:54:   required from here
/builddir/build/BUILD/pytorch/aten/src/ATen/native/cuda/SoftMax.cu:844:2132: warning: comparison of integer expressions of different signedness: 'int64_t' {aka 'long int'} and 'long unsigned int' [-Wsign-compare]
  844 | AT_DISPATCH_FLOATING_TYPES_AND2(at::ScalarType::Half, at::ScalarType::BFloat16, input.scalar_type(), "host_softmax", [&] {
      | ^
[this -Wsign-compare warning is likewise repeated for each remaining dispatch case of this instantiation]
[ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/SortStable.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/Sorting.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/SparseBinaryOpIntersectionKernel.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/SparseMM.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/SpectralOps.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/StepKernel.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/SummaryOps.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/TensorCompare.cu.o [ 94%] Building CUDA object
caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/TensorModeKernel.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/TensorShape.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/TensorTopK.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/TensorTransformations.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/TriangularOps.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/UnaryComplexKernels.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/UnaryFractionKernels.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/UnaryGammaKernels.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/UnaryGeometricAcosKernel.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/UnaryGeometricAcoshKernel.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/UnaryGeometricAsinKernel.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/UnaryGeometricAsinhKernel.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/UnaryGeometricAtanKernel.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/UnaryGeometricAtanhKernel.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/UnaryGeometricCosKernel.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/UnaryGeometricCoshKernel.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/UnaryGeometricSinKernel.cu.o [ 94%] Building CUDA object 
caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/UnaryGeometricSinhKernel.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/UnaryGeometricTanKernel.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/UnaryGeometricTanhKernel.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/UnaryLogKernels.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/UnaryOpsKernel.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/UnarySignKernels.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/UnarySpecialOpsKernel.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/UnfoldBackwardKernel.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/UniqueCub.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/UpSampleBicubic2d.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/UpSampleBilinear2d.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/UpSampleLinear1d.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/UpSampleNearest1d.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/UpSampleNearest2d.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/UpSampleNearest3d.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/UpSampleTrilinear3d.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ValidateCompressedIndicesKernel.cu.o [ 94%] Building CUDA object 
caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/WeightNorm.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ZetaKernel.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/airy_ai.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/bessel_j0.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/bessel_j1.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/bessel_y0.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/bessel_y1.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/chebyshev_polynomial_t.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/chebyshev_polynomial_u.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/chebyshev_polynomial_v.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/chebyshev_polynomial_w.cu.o [ 94%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/fused_adam_amsgrad_impl.cu.o [ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/fused_adam_impl.cu.o [ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/fused_adamw_amsgrad_impl.cu.o [ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/fused_adamw_impl.cu.o [ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/group_norm_kernel.cu.o [ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/hermite_polynomial_h.cu.o [ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/hermite_polynomial_he.cu.o [ 95%] 
Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/int4mm.cu.o [ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/laguerre_polynomial_l.cu.o [ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/layer_norm_kernel.cu.o [ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/legendre_polynomial_p.cu.o [ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/modified_bessel_i0.cu.o [ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/modified_bessel_i1.cu.o [ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/modified_bessel_k0.cu.o [ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/modified_bessel_k1.cu.o [ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/scaled_modified_bessel_k0.cu.o [ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/scaled_modified_bessel_k1.cu.o [ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/shifted_chebyshev_polynomial_t.cu.o [ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/shifted_chebyshev_polynomial_u.cu.o [ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/shifted_chebyshev_polynomial_v.cu.o [ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/shifted_chebyshev_polynomial_w.cu.o [ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/spherical_bessel_j0.cu.o [ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/nested/cuda/NestedTensorBinaryOps.cu.o [ 95%] Building CUDA object 
caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/nested/cuda/NestedTensorMatmul.cu.o [ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/nested/cuda/NestedTensorTransformerFunctions.cu.o [ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/sparse/cuda/SoftMax.cu.o [ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/sparse/cuda/SparseCUDATensor.cu.o [ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/sparse/cuda/SparseCUDATensorMath.cu.o [ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/sparse/cuda/SparseCsrTensorMath.cu.o [ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/sparse/cuda/SparseMatMul.cu.o [ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/sparse/cuda/SparseSemiStructuredLinear.cu.o [ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/sparse/cuda/SparseSemiStructuredOps.cu.o [ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/quantized/cuda/Activation.cu.o [ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/quantized/cuda/AffineQuantizer.cu.o [ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/quantized/cuda/EmbeddingBag.cu.o [ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/quantized/cuda/FakeQuantizeCore.cu.o [ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/quantized/cuda/FusedObsFakeQuant.cu.o [ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/quantized/cuda/IntReprQuant.cu.o [ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/quantized/cuda/MakePerTensorQuantizedTensor.cu.o [ 95%] Building CUDA object 
caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/attention.cu.o
[ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/attention_backward.cu.o
[ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_bwd_hdim128_bf16_sm80.cu.o
[ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_bwd_hdim128_fp16_sm80.cu.o
[ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_bwd_hdim160_bf16_sm80.cu.o
[ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_bwd_hdim160_fp16_sm80.cu.o
[ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_bwd_hdim192_bf16_sm80.cu.o
[ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_bwd_hdim192_fp16_sm80.cu.o
[ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_bwd_hdim224_bf16_sm80.cu.o
[ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_bwd_hdim224_fp16_sm80.cu.o
[ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_bwd_hdim256_bf16_sm80.cu.o
[ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_bwd_hdim256_fp16_sm80.cu.o
[ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_bwd_hdim32_bf16_sm80.cu.o
[ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_bwd_hdim32_fp16_sm80.cu.o
[ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_bwd_hdim64_bf16_sm80.cu.o
[ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_bwd_hdim64_fp16_sm80.cu.o
[ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_bwd_hdim96_bf16_sm80.cu.o
[ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_bwd_hdim96_fp16_sm80.cu.o
[ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_hdim128_bf16_sm80.cu.o
[ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_hdim128_fp16_sm80.cu.o
[ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_hdim160_bf16_sm80.cu.o
[ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_hdim160_fp16_sm80.cu.o
[ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_hdim192_bf16_sm80.cu.o
[ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_hdim192_fp16_sm80.cu.o
[ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_hdim224_bf16_sm80.cu.o
[ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_hdim224_fp16_sm80.cu.o
[ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_hdim256_bf16_sm80.cu.o
[ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_hdim256_fp16_sm80.cu.o
[ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_hdim32_bf16_sm80.cu.o
[ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_hdim32_fp16_sm80.cu.o
[ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_hdim64_bf16_sm80.cu.o
[ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_hdim64_fp16_sm80.cu.o
[ 95%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_hdim96_bf16_sm80.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_hdim96_fp16_sm80.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_split_hdim128_bf16_sm80.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_split_hdim128_fp16_sm80.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_split_hdim160_bf16_sm80.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_split_hdim160_fp16_sm80.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_split_hdim192_bf16_sm80.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_split_hdim192_fp16_sm80.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_split_hdim224_bf16_sm80.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_split_hdim224_fp16_sm80.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_split_hdim256_bf16_sm80.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_split_hdim256_fp16_sm80.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_split_hdim32_bf16_sm80.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_split_hdim32_fp16_sm80.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_split_hdim64_bf16_sm80.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_split_hdim64_fp16_sm80.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_split_hdim96_bf16_sm80.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/kernels/flash_fwd_split_hdim96_fp16_sm80.cu.o
[ 96%] Building CUDA object
caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_bf16_aligned_k128.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_bf16_aligned_k128_dropout.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_bf16_aligned_k32.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_bf16_aligned_k32_dropout.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_bf16_aligned_k64.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_bf16_aligned_k64_dropout.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_bf16_aligned_k65536.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_bf16_aligned_k65536_dropout.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_bf16_aligned_k96.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_f16_aligned_k128.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_f16_aligned_k128_dropout.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_f16_aligned_k32.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_f16_aligned_k32_dropout.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_f16_aligned_k64.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_f16_aligned_k64_dropout.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_f16_aligned_k65536.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_f16_aligned_k65536_dropout.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_f16_aligned_k96.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_f16_notaligned_k128.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_f16_notaligned_k128_dropout.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_f16_notaligned_k32.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_f16_notaligned_k32_dropout.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_f16_notaligned_k64.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_f16_notaligned_k64_dropout.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_f16_notaligned_k65536.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_f16_notaligned_k65536_dropout.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_f32_aligned_k128.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_f32_aligned_k128_dropout.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_f32_aligned_k32.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_f32_aligned_k32_dropout.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_f32_aligned_k64.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_f32_aligned_k64_dropout.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_f32_aligned_k65536.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_f32_aligned_k65536_dropout.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_f32_notaligned_k128.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_f32_notaligned_k128_dropout.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_f32_notaligned_k32.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_f32_notaligned_k32_dropout.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_f32_notaligned_k64.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_f32_notaligned_k64_dropout.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_f32_notaligned_k65536.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassB_f32_notaligned_k65536_dropout.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassF_bf16_aligned.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassF_f16_aligned.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassF_f16_notaligned.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassF_f32_aligned.cu.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernels/cutlassF_f32_notaligned.cu.o
[ 96%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/RegisterCUDA.cpp.o
[ 96%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/RegisterNestedTensorCUDA.cpp.o
[ 96%] Building CXX object
caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/RegisterQuantizedCUDA.cpp.o
[ 96%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/RegisterSparseCUDA.cpp.o
[ 96%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/RegisterSparseCsrCUDA.cpp.o
[ 96%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/UfuncCUDA_add.cu.o
[ 96%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/cuda/CUDABlas.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/cuda/CUDASparseBlas.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/cuda/CublasHandlePool.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/cuda/tunable/StreamTimer.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/cuda/tunable/Tunable.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/Activation.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/LinearAlgebraStubs.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/Blas.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/Distributions.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/Equal.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/GridSampler.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/IndexKernel.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ReduceOps.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/ScanKernels.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/Sort.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/Sorting.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/TensorModeKernel.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/TensorShapeCUDA.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/TensorTopK.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/jit_utils.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/nested/cuda/NestedTensorTransformerFunctions.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/sparse/cuda/SparseBlas.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/sparse/cuda/SparseBlasImpl.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/sparse/cuda/SparseBlasLegacy.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/sparse/cuda/SparseCUDABlas.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/transformers/cuda/flash_attn/flash_api.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/torch/csrc/CudaIPCTypes.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/torch/csrc/cuda/comm.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/torch/csrc/cuda/memory_snapshot.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/torch/csrc/inductor/aoti_runner/model_container_runner_cuda.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/torch/csrc/inductor/aoti_torch/shim_cuda.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/torch/csrc/jit/codegen/fuser/cuda/fused_kernel.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/torch/csrc/profiler/stubs/cuda.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/torch/csrc/autograd/functions/comm.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/torch/csrc/jit/passes/frozen_conv_add_relu_fusion_cuda.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/torch/csrc/jit/tensorexpr/cuda_codegen.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda.dir/__/torch/csrc/jit/runtime/register_cuda_ops.cpp.o
[ 97%] Building CUDA object caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/Unique.cu.o
[ 97%] Linking CXX shared library ../lib/libtorch_cuda.so
Warning: Unused direct dependencies: libc10_cuda.so /lib64/libgloo_cuda.so.1 /usr/local/cuda-12.3/lib64/libcurand.so.10 libc10.so.2.4 /lib64/libgflags.so.2.2 libtorch_cpu.so.2.4
[ 97%] Built target torch_cuda
[ 97%] Building CXX object caffe2/CMakeFiles/torch.dir/__/empty.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda_linalg.dir/__/aten/src/ATen/native/cuda/linalg/BatchLinearAlgebra.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda_linalg.dir/__/aten/src/ATen/native/cuda/linalg/BatchLinearAlgebraLib.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda_linalg.dir/__/aten/src/ATen/native/cuda/linalg/BatchLinearAlgebraLibBlas.cpp.o
[ 97%] Linking CXX shared library ../lib/libtorch.so
Warning: Unused direct dependencies: /lib64/libstdc++.so.6 libtorch_cpu.so.2.4 libtorch_cuda.so
[ 97%] Built target torch
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda_linalg.dir/__/aten/src/ATen/native/cuda/linalg/CUDASolver.cpp.o
[ 97%] Building CXX object caffe2/CMakeFiles/torch_cuda_linalg.dir/__/aten/src/ATen/native/cuda/linalg/CusolverDnHandlePool.cpp.o
[ 97%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/autograd/generated/python_functions_0.cpp.o
[ 97%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/autograd/generated/python_functions_1.cpp.o
[ 97%] Building CXX object
caffe2/torch/CMakeFiles/torch_python.dir/csrc/autograd/generated/python_functions_2.cpp.o
[ 97%] Linking CXX shared library ../lib/libtorch_cuda_linalg.so
Warning: Unused direct dependencies: libtorch_cpu.so.2.4 libtorch_cuda.so libc10_cuda.so /usr/local/cuda-12.3/lib64/libnvToolsExt.so.1 /lib64/libprotobuf.so.32 libc10.so.2.4 /lib64/libgflags.so.2.2 /lib64/libglog.so.0 /lib64/libqnnpack.so.1 /lib64/libgloo.so.1 /lib64/libgloo_cuda.so.1
[ 97%] Built target torch_cuda_linalg
[ 97%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/autograd/generated/python_functions_3.cpp.o
[ 97%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/autograd/generated/python_functions_4.cpp.o
[ 97%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/autograd/generated/python_variable_methods.cpp.o
[ 97%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/autograd/generated/python_torch_functions_0.cpp.o
[ 97%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/autograd/generated/python_torch_functions_1.cpp.o
[ 97%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/autograd/generated/python_torch_functions_2.cpp.o
[ 97%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/autograd/generated/python_nn_functions.cpp.o
[ 97%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/autograd/generated/python_fft_functions.cpp.o
[ 97%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/autograd/generated/python_linalg_functions.cpp.o
[ 97%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/autograd/generated/python_nested_functions.cpp.o
[ 97%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/autograd/generated/python_sparse_functions.cpp.o
[ 97%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/autograd/generated/python_special_functions.cpp.o
[ 97%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/autograd/generated/python_return_types.cpp.o
[ 97%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/autograd/generated/python_enum_tag.cpp.o
[ 97%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/DataLoader.cpp.o
[ 97%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/Device.cpp.o
[ 97%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/Dtype.cpp.o
[ 97%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/DynamicTypes.cpp.o
[ 97%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/Exceptions.cpp.o
[ 97%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/Generator.cpp.o
[ 97%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/Layout.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/MemoryFormat.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/QScheme.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/Module.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/PyInterpreter.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/python_dimname.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/Size.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/Storage.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/StorageMethods.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/StorageSharing.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/Stream.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/TypeInfo.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/api/src/python/init.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/autograd/functions/init.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/autograd/init.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/autograd/profiler_python.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/autograd/python_anomaly_mode.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/autograd/python_saved_variable_hooks.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/autograd/python_cpp_function.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/autograd/python_engine.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/autograd/python_function.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/autograd/python_hook.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/autograd/python_legacy_variable.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/autograd/python_nested_functions_manual.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/autograd/python_torch_functions_manual.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/autograd/python_variable.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/autograd/python_variable_indexing.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/dynamo/python_compiled_autograd.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/dynamo/cache_entry.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/dynamo/cpp_shim.cpp.o
[ 98%] Building C object caffe2/torch/CMakeFiles/torch_python.dir/csrc/dynamo/cpython_defs.c.o
[ 98%] Building C object caffe2/torch/CMakeFiles/torch_python.dir/csrc/dynamo/eval_frame.c.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/dynamo/extra_state.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/dynamo/guards.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/dynamo/init.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/functorch/init.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/mps/Module.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/inductor/aoti_runner/pybind.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/backends/backend_init.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/python/init.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/passes/onnx.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/passes/onnx/cast_all_constant_to_floating.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/passes/onnx/deduplicate_initializers.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/passes/onnx/eval_peephole.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/passes/onnx/constant_fold.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/passes/onnx/constant_map.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/passes/onnx/eliminate_unused_items.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/passes/onnx/fixup_onnx_controlflow.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/passes/onnx/list_model_parameters.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/passes/onnx/function_substitution.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/passes/onnx/helper.cpp.o
[ 98%] Building
CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/passes/onnx/peephole.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/passes/onnx/preprocess_for_onnx.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/passes/onnx/prepare_division_for_onnx.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/passes/onnx/scalar_type_analysis.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/passes/onnx/unpack_quantized_weights.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/passes/onnx/remove_inplace_ops_for_onnx.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/passes/onnx/shape_type_inference.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/passes/onnx/function_extraction.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/passes/onnx/onnx_log.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/passes/onnx/naming.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/python/pybind_utils.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/passes/onnx/pattern_conversion/autograd_function_process.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/passes/onnx/pattern_conversion/common.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/passes/onnx/pattern_conversion/pattern_encapsulation.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/passes/onnx/pattern_conversion/pattern_conversion.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/python/python_arg_flatten.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/python/python_custom_class.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/python/python_dict.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/python/python_interpreter.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/python/python_ir.cpp.o
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/python/python_list.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/python/python_tracer.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/python/script_init.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/frontend/concrete_module_type.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/frontend/tree_views.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/python/python_sugared_value.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/python/python_tree_views.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/runtime/static/init.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/tensorexpr/tensorexpr_init.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/monitor/python_init.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/multiprocessing/init.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/onnx/init.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/profiler/python/init.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/profiler/python/combined_traceback.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/serialization.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/tensor/python_tensor.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/utils/init.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/utils/throughput_benchmark.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/utils.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/utils/device_lazy_init.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/utils/invalid_arguments.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/utils/nested.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/utils/object_ptr.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/utils/python_arg_parser.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/utils/python_dispatch.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/utils/python_symnode.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/utils/pybind.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/utils/pyobject_preservation.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/utils/structseq.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/utils/tensor_apply.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/utils/tensor_dtypes.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/utils/tensor_layouts.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/utils/tensor_memoryformats.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/utils/tensor_qschemes.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/utils/tensor_list.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/utils/tensor_new.cpp.o
[ 99%] Building CXX object
caffe2/torch/CMakeFiles/torch_python.dir/csrc/utils/tensor_numpy.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/utils/tensor_types.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/utils/disable_torch_function.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/utils/verbose.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/cpu/Module.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/lazy/python/init.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/lazy/python/python_util.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/itt.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/cuda/Event.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/cuda/Module.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/cuda/python_comm.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/cuda/Stream.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/cuda/Graph.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/cuda/shared/cudart.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/cuda/shared/nvtx.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/cuda/utils.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/cuda/CUDAPluggableAllocator.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/cuda/shared/cudnn.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/distributed/c10d/init.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/distributed/c10d/python_comm_hook.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/distributed/autograd/init.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/distributed/rpc/init.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/distributed/rpc/py_rref.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/distributed/rpc/python_functions.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/distributed/rpc/python_rpc_handler.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/distributed/rpc/request_callback_impl.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/distributed/rpc/testing/init.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/distributed/rpc/unpickled_python_call.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/distributed/rpc/unpickled_python_remote_call.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/runtime/register_distributed_ops.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/cuda/python_nccl.cpp.o
[ 99%] Linking CXX shared library ../../lib/libtorch_python.so
Warning: Unused direct dependencies: libshm.so.2.4 libtorch.so.2.4 libtorch_cpu.so.2.4 libtorch_cuda.so libc10_cuda.so libc10.so.2.4
[ 99%] Built target torch_python
[ 99%] Building CXX object caffe2/torch/CMakeFiles/nnapi_backend.dir/csrc/jit/backends/nnapi/nnapi_backend_lib.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/nnapi_backend.dir/csrc/jit/backends/nnapi/nnapi_backend_preprocess.cpp.o
[ 99%] Building C object caffe2/torch/CMakeFiles/_C.dir/csrc/stub.c.o
[ 99%] Building CXX object functorch/CMakeFiles/functorch.dir/csrc/dim/dim.cpp.o
[ 99%] Linking C shared library ../../lib/_C.so
Warning: Unused direct dependencies: /lib64/libstdc++.so.6 libtorch_python.so.2.4
[ 99%] Built target _C
[ 99%] Building C object functorch/CMakeFiles/functorch.dir/csrc/dim/dim_opcode.c.o
[ 99%] Building CXX object functorch/CMakeFiles/functorch.dir/csrc/init_dim_only.cpp.o
[100%] Linking CXX shared library ../../lib/libnnapi_backend.so
Warning: Unused direct dependencies: libtorch.so.2.4 libtorch_python.so.2.4 libtorch_cpu.so.2.4 libtorch_cuda.so libc10.so.2.4
[100%] Built target nnapi_backend
[100%] Linking CXX shared module functorch.so
[100%] Built target functorch
+ popd
~/build/BUILD/pytorch
+ RPM_EC=0
++ jobs -p
+ exit 0
Executing(%install): /bin/sh -e /var/tmp/rpm-tmp.HyBECb
+ umask 022
+ cd /builddir/build/BUILD
+ '[' /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64 '!=' / ']'
+ rm -rf /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64
++ dirname /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64
+ mkdir -p /builddir/build/BUILDROOT
+ mkdir /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64
+ CFLAGS='-O2 -fexceptions -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -w -fpermissive -Wno-sign-compare -Wno-deprecated-declarations -Wno-nonnull -DEIGEN_HAS_CXX11_MATH=1 '
~/build/BUILD/pytorch/build ~/build/BUILD/pytorch
+ export CFLAGS
+ CXXFLAGS='-O2 -fexceptions -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -w -fpermissive -Wno-sign-compare -Wno-deprecated-declarations -Wno-nonnull -DEIGEN_HAS_CXX11_MATH=1 '
+ export CXXFLAGS
+ FFLAGS='-O2 -fexceptions -grecord-gcc-switches -pipe -Wall -Werror=format-security
-Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -w -fpermissive -Wno-sign-compare -Wno-deprecated-declarations -Wno-nonnull -DEIGEN_HAS_CXX11_MATH=1 -I/usr/lib64/gfortran/modules ' + export FFLAGS + FCFLAGS='-O2 -fexceptions -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -w -fpermissive -Wno-sign-compare -Wno-deprecated-declarations -Wno-nonnull -DEIGEN_HAS_CXX11_MATH=1 -I/usr/lib64/gfortran/modules ' + export FCFLAGS + VALAFLAGS=-g + export VALAFLAGS + LDFLAGS='-Wl,-z,relro -Wl,--as-needed -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes -Wl,-lstdc++' + export LDFLAGS + LT_SYS_LIBRARY_PATH=/usr/lib64: + export LT_SYS_LIBRARY_PATH + CC=gcc + export CC + CXX=g++ + export CXX + cd pytorch + pushd build + export PYTHON_EXECUTABLE=/usr/bin/python3 + PYTHON_EXECUTABLE=/usr/bin/python3 + make install DESTDIR=/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64 [ 0%] Built target clog [ 0%] Built target fp16 [ 1%] Built target pytorch_qnnpack [ 1%] Built target fxdiv [ 1%] Built target psimd [ 71%] Built target microkernels-all [ 71%] Built target microkernels-prod [ 71%] Built target logging [ 71%] Built target hardware-config [ 71%] Built target indirection [ 71%] Built target jit [ 71%] Built target microparams-init [ 71%] Built target normalization [ 71%] Built target packing [ 71%] Built target allocator [ 71%] Built target memory [ 72%] Built target cache [ 72%] Built target microkernel-utils [ 72%] Built target mutex [ 72%] Built target post-operation [ 72%] Built target operator-utils [ 72%] Built target operators [ 72%] Built target 
operator-run [ 73%] Built target subgraph [ 73%] Built target convolution-test-helpers [ 73%] Built target XNNPACK [ 73%] Built target ittnotify [ 73%] Built target fmt [ 74%] Built target c10 [ 74%] Built target c10_cuda [ 74%] Built target Caffe2_PROTO [ 74%] Built target caffe2_protos [ 74%] Built target caffe2_nvrtc [ 74%] Built target ATEN_CPU_FILES_GEN_TARGET [ 90%] Built target torch_cpu [ 90%] Built target ATEN_CUDA_FILES_GEN_TARGET [ 96%] Built target torch_cuda [ 96%] Built target torch [ 96%] Built target torch_cuda_linalg [ 96%] Built target torch_global_deps [ 96%] Built target python_copy_files [ 96%] Built target shm [ 96%] Built target generate-torch-sources [ 96%] Built target torch_python_stubs [ 96%] Built target gen_torch_version [ 98%] Built target torch_python [ 98%] Built target _C [ 99%] Built target nnapi_backend [100%] Built target torch_shm_manager [100%] Built target functorch Install the project... -- Install configuration: "Release" + mkdir -p /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib64 + find /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/ -name '*.a' -type f -prune -exec rm -rf '{}' + + rm -rf /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib/python3.11 + mv -f /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib/libc10.so /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib/libc10.so.2.4 /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib/libc10.so.2.4.0 /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib/libc10_cuda.so /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib/libcaffe2_nvrtc.so /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib/libshm.so 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib/libshm.so.2.4 /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib/libshm.so.2.4.0 /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib/libtorch.so /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib/libtorch.so.2.4 /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib/libtorch.so.2.4.0 /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib/libtorch_cpu.so /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib/libtorch_cpu.so.2.4 /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib/libtorch_cpu.so.2.4.0 /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib/libtorch_cuda.so /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib/libtorch_cuda_linalg.so /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib/libtorch_global_deps.so /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib/libtorch_global_deps.so.2.4 /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib/libtorch_global_deps.so.2.4.0 /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib/libtorch_python.so /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib/libtorch_python.so.2.4 /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib/libtorch_python.so.2.4.0 /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib64/ + popd ~/build/BUILD/pytorch + install -D -pm 755 build/lib/libnnapi_backend.so 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/ + mkdir -p /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/torch/bin + install -D -pm 644 build/lib/_C.so /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/torch/ + install -D -pm 644 build/functorch/functorch.so /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/functorch/_C.so + install -D -pm 644 aten/src/THC/THCDeviceUtils.cuh /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/include/THC/ + ln -sf /usr/include /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/torch/include + ln -sf /usr/lib64 /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/torch/lib + ln -sf /usr/bin/torch_shm_manager /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/torch/bin/torch_shm_manager ++ find ./torch/ -name '*.py' + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/version.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/version.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/xpu/streams.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/xpu/streams.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/xpu/random.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/xpu/random.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/xpu/_utils.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/xpu/_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/xpu/_gpu_trace.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/xpu/_gpu_trace.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/xpu/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/xpu/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/weak.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/weak.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/viz/_cycles.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/viz/_cycles.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/viz/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/viz/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/throughput_benchmark.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/throughput_benchmark.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/tensorboard/writer.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/tensorboard/writer.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/tensorboard/summary.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/tensorboard/summary.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/tensorboard/_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/tensorboard/_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/tensorboard/_pytorch_graph.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/tensorboard/_pytorch_graph.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/tensorboard/_proto_graph.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/tensorboard/_proto_graph.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/tensorboard/_onnx_graph.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/tensorboard/_onnx_graph.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/tensorboard/_embedding.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/tensorboard/_embedding.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/tensorboard/_convert_np.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/tensorboard/_convert_np.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/tensorboard/_caffe2_graph.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/tensorboard/_caffe2_graph.py + for f in `find 
./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/tensorboard/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/tensorboard/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/show_pickle.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/show_pickle.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/model_zoo.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/model_zoo.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/model_dump/__main__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/model_dump/__main__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/model_dump/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/model_dump/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/mobile_optimizer.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/mobile_optimizer.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/mkldnn.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/mkldnn.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/jit/log_extract.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/jit/log_extract.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 
./torch/utils/jit/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/jit/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/hooks.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/hooks.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/hipify/version.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/hipify/version.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/hipify/hipify_python.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/hipify/hipify_python.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/hipify/cuda_to_hip_mappings.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/hipify/cuda_to_hip_mappings.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/hipify/constants.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/hipify/constants.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/hipify/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/hipify/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/flop_counter.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/flop_counter.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/file_baton.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/file_baton.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/dlpack.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/dlpack.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/deterministic.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/deterministic.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/sampler.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/sampler.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/graph_settings.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/graph_settings.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/graph.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/graph.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/distributed.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/distributed.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/dataset.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/dataset.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/datapipes/utils/snapshot.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/datapipes/utils/snapshot.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/datapipes/utils/decoder.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/datapipes/utils/decoder.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/datapipes/utils/common.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/datapipes/utils/common.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/datapipes/utils/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/datapipes/utils/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/datapipes/map/utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/datapipes/map/utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/datapipes/map/grouping.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/datapipes/map/grouping.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/datapipes/map/combining.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/datapipes/map/combining.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/datapipes/map/combinatorics.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/datapipes/map/combinatorics.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/datapipes/map/callable.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/datapipes/map/callable.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/datapipes/map/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/datapipes/map/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/datapipes/iter/utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/datapipes/iter/utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/datapipes/iter/streamreader.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/datapipes/iter/streamreader.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/datapipes/iter/sharding.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/datapipes/iter/sharding.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/datapipes/iter/selecting.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/datapipes/iter/selecting.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/datapipes/iter/routeddecoder.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/datapipes/iter/routeddecoder.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/datapipes/iter/grouping.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/datapipes/iter/grouping.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/datapipes/iter/fileopener.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/datapipes/iter/fileopener.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/datapipes/iter/filelister.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/datapipes/iter/filelister.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/datapipes/iter/combining.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/datapipes/iter/combining.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/datapipes/iter/combinatorics.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/datapipes/iter/combinatorics.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/datapipes/iter/callable.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/datapipes/iter/callable.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/datapipes/iter/__init__.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/datapipes/iter/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/datapipes/gen_pyi.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/datapipes/gen_pyi.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/datapipes/datapipe.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/datapipes/datapipe.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/datapipes/dataframe/structures.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/datapipes/dataframe/structures.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/datapipes/dataframe/datapipes.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/datapipes/dataframe/datapipes.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/datapipes/dataframe/dataframes.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/datapipes/dataframe/dataframes.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/datapipes/dataframe/dataframe_wrapper.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/datapipes/dataframe/dataframe_wrapper.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/datapipes/dataframe/__init__.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/datapipes/dataframe/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/datapipes/_typing.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/datapipes/_typing.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/datapipes/_hook_iterator.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/datapipes/_hook_iterator.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/datapipes/_decorator.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/datapipes/_decorator.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/datapipes/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/datapipes/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/dataloader.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/dataloader.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/backward_compatibility.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/backward_compatibility.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/_utils/worker.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/_utils/worker.py + for f 
in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/_utils/signal_handling.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/_utils/signal_handling.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/_utils/pin_memory.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/_utils/pin_memory.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/_utils/fetch.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/_utils/fetch.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/_utils/collate.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/_utils/collate.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/_utils/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/_utils/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/data/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/data/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/cpp_extension.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/cpp_extension.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/utils/cpp_backtrace.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/cpp_backtrace.py 
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/collect_env.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/collect_env.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/checkpoint.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/checkpoint.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/bundled_inputs.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/bundled_inputs.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/bottleneck/__main__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/bottleneck/__main__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/bottleneck/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/bottleneck/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/benchmark/utils/valgrind_wrapper/timer_interface.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/benchmark/utils/valgrind_wrapper/timer_interface.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/benchmark/utils/valgrind_wrapper/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/benchmark/utils/valgrind_wrapper/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/benchmark/utils/timer.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/benchmark/utils/timer.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/benchmark/utils/sparse_fuzzer.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/benchmark/utils/sparse_fuzzer.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/benchmark/utils/fuzzer.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/benchmark/utils/fuzzer.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/benchmark/utils/cpp_jit.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/benchmark/utils/cpp_jit.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/benchmark/utils/compile.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/benchmark/utils/compile.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/benchmark/utils/compare.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/benchmark/utils/compare.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/benchmark/utils/common.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/benchmark/utils/common.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/benchmark/utils/_stubs.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/benchmark/utils/_stubs.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/benchmark/utils/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/benchmark/utils/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/benchmark/op_fuzzers/unary.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/benchmark/op_fuzzers/unary.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/benchmark/op_fuzzers/spectral.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/benchmark/op_fuzzers/spectral.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/benchmark/op_fuzzers/sparse_unary.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/benchmark/op_fuzzers/sparse_unary.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/benchmark/op_fuzzers/sparse_binary.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/benchmark/op_fuzzers/sparse_binary.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/benchmark/op_fuzzers/binary.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/benchmark/op_fuzzers/binary.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/benchmark/op_fuzzers/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/benchmark/op_fuzzers/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/benchmark/examples/spectral_ops_fuzz_test.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/benchmark/examples/spectral_ops_fuzz_test.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/benchmark/examples/sparse/op_benchmark.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/benchmark/examples/sparse/op_benchmark.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/benchmark/examples/sparse/fuzzer.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/benchmark/examples/sparse/fuzzer.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/benchmark/examples/sparse/compare.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/benchmark/examples/sparse/compare.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/benchmark/examples/simple_timeit.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/benchmark/examples/simple_timeit.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/benchmark/examples/op_benchmark.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/benchmark/examples/op_benchmark.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/benchmark/examples/fuzzer.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/benchmark/examples/fuzzer.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/benchmark/examples/compare.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/benchmark/examples/compare.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/benchmark/examples/blas_compare_setup.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/benchmark/examples/blas_compare_setup.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/benchmark/examples/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/benchmark/examples/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/benchmark/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/benchmark/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/backend_registration.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/backend_registration.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/backcompat/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/backcompat/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/_zip.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/_zip.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/_typing_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/_typing_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/_triton.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/_triton.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/_traceback.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/_traceback.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/_sympy/value_ranges.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/_sympy/value_ranges.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/_sympy/solve.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/_sympy/solve.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/_sympy/singleton_int.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/_sympy/singleton_int.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/_sympy/reference.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/_sympy/reference.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/_sympy/interp.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/_sympy/interp.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/_sympy/functions.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/_sympy/functions.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/_sympy/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/_sympy/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/_stats.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/_stats.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/_pytree.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/_pytree.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/_python_dispatch.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/_python_dispatch.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/_mode_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/_mode_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/_import_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/_import_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/_freeze.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/_freeze.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/_foreach_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/_foreach_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/_exposed_in.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/_exposed_in.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/_device.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/_device.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/_cxx_pytree.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/_cxx_pytree.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/_cpp_extension_versioner.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/_cpp_extension_versioner.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/_contextlib.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/_contextlib.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/_content_store.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/_content_store.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/_config_module.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/_config_module.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/utils/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/utils/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/types.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/types.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/torch_version.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/torch_version.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/two_tensor.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/two_tensor.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/triton_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/triton_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/torchbind_impls.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/torchbind_impls.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/test_module/no_future_div.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/test_module/no_future_div.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/test_module/future_div.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/test_module/future_div.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/test_module/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/test_module/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/static_module.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/static_module.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/quantization_torch_package_models.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/quantization_torch_package_models.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/optests/make_fx.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/optests/make_fx.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/optests/generate_tests.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/optests/generate_tests.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/optests/fake_tensor.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/optests/fake_tensor.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/optests/autograd_registration.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/optests/autograd_registration.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/optests/aot_autograd.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/optests/aot_autograd.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/optests/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/optests/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/opinfo/utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/opinfo/utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/opinfo/refs.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/opinfo/refs.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/opinfo/definitions/special.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/opinfo/definitions/special.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/opinfo/definitions/sparse.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/opinfo/definitions/sparse.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/opinfo/definitions/signal.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/opinfo/definitions/signal.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/opinfo/definitions/linalg.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/opinfo/definitions/linalg.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/opinfo/definitions/fft.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/opinfo/definitions/fft.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/opinfo/definitions/_masked.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/opinfo/definitions/_masked.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/opinfo/definitions/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/opinfo/definitions/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/opinfo/core.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/opinfo/core.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/opinfo/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/opinfo/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/logging_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/logging_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/logging_tensor.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/logging_tensor.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/jit_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/jit_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/jit_metaprogramming_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/jit_metaprogramming_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/inductor_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/inductor_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/hypothesis_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/hypothesis_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/hop_db.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/hop_db.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/generated/annotated_fn_args.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/generated/annotated_fn_args.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/generated/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/generated/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/dynamo_test_failures.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/dynamo_test_failures.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/distributed/rpc_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/distributed/rpc_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/distributed/rpc/tensorpipe_rpc_agent_test_fixture.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/distributed/rpc/tensorpipe_rpc_agent_test_fixture.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/distributed/rpc/rpc_test.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/distributed/rpc/rpc_test.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/distributed/rpc/rpc_agent_test_fixture.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/distributed/rpc/rpc_agent_test_fixture.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/distributed/rpc/jit/rpc_test_faulty.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/distributed/rpc/jit/rpc_test_faulty.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/distributed/rpc/jit/rpc_test.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/distributed/rpc/jit/rpc_test.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/distributed/rpc/jit/dist_autograd_test.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/distributed/rpc/jit/dist_autograd_test.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/distributed/rpc/jit/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/distributed/rpc/jit/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/distributed/rpc/faulty_rpc_agent_test_fixture.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/distributed/rpc/faulty_rpc_agent_test_fixture.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/distributed/rpc/faulty_agent_rpc_test.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/distributed/rpc/faulty_agent_rpc_test.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/distributed/rpc/examples/reinforcement_learning_rpc_test.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/distributed/rpc/examples/reinforcement_learning_rpc_test.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/distributed/rpc/examples/parameter_server_test.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/distributed/rpc/examples/parameter_server_test.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/distributed/rpc/examples/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/distributed/rpc/examples/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/distributed/rpc/dist_optimizer_test.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/distributed/rpc/dist_optimizer_test.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/distributed/rpc/dist_autograd_test.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/distributed/rpc/dist_autograd_test.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/distributed/rpc/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/distributed/rpc/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/distributed/pipeline/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/distributed/pipeline/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/distributed/pipe_with_ddp_test.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/distributed/pipe_with_ddp_test.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/distributed/nn/api/remote_module_test.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/distributed/nn/api/remote_module_test.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/distributed/nn/api/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/distributed/nn/api/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/distributed/nn/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/distributed/nn/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/distributed/multi_threaded_pg.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/distributed/multi_threaded_pg.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/distributed/fake_pg.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/distributed/fake_pg.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/distributed/distributed_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/distributed/distributed_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/distributed/distributed_test.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/distributed/distributed_test.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/distributed/ddp_under_dist_autograd_test.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/distributed/ddp_under_dist_autograd_test.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/distributed/common_state_dict.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/distributed/common_state_dict.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/distributed/checkpoint_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/distributed/checkpoint_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/distributed/_tensor/common_dtensor.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/distributed/_tensor/common_dtensor.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/distributed/_tensor/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/distributed/_tensor/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/distributed/_shard/test_common.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/distributed/_shard/test_common.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/distributed/_shard/sharded_tensor/_test_st_common.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/distributed/_shard/sharded_tensor/_test_st_common.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/distributed/_shard/sharded_tensor/_test_ops_common.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/distributed/_shard/sharded_tensor/_test_ops_common.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/distributed/_shard/sharded_tensor/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/distributed/_shard/sharded_tensor/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/distributed/_shard/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/distributed/_shard/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/distributed/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/distributed/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/testing/_internal/dist_utils.py
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/dist_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/testing/_internal/data/network2.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/data/network2.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/testing/_internal/data/network1.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/data/network1.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/testing/_internal/data/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/data/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/testing/_internal/custom_op_db.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/custom_op_db.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/testing/_internal/composite_compliance.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/composite_compliance.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/testing/_internal/common_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/common_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/testing/_internal/common_subclass.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/common_subclass.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/testing/_internal/common_quantized.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/common_quantized.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/testing/_internal/common_quantization.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/common_quantization.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/testing/_internal/common_pruning.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/common_pruning.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/testing/_internal/common_optimizers.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/common_optimizers.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/testing/_internal/common_nn.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/common_nn.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/testing/_internal/common_modules.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/common_modules.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/testing/_internal/common_mkldnn.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/common_mkldnn.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/testing/_internal/common_methods_invocations.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/common_methods_invocations.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/testing/_internal/common_jit.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/common_jit.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/testing/_internal/common_fsdp.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/common_fsdp.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/testing/_internal/common_dtype.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/common_dtype.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/testing/_internal/common_distributed.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/common_distributed.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/testing/_internal/common_dist_composable.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/common_dist_composable.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/testing/_internal/common_device_type.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/common_device_type.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/testing/_internal/common_cuda.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/common_cuda.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/testing/_internal/codegen/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/codegen/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/testing/_internal/check_kernel_launches.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/check_kernel_launches.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/testing/_internal/autograd_function_db.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/autograd_function_db.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/testing/_internal/autocast_test_lists.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/autocast_test_lists.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/testing/_internal/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_internal/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/testing/_creation.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_creation.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/testing/_comparison.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/_comparison.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/testing/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/testing/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/storage.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/storage.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/special/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/special/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/sparse/semi_structured.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/sparse/semi_structured.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/sparse/_triton_ops_meta.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/sparse/_triton_ops_meta.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/sparse/_triton_ops.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/sparse/_triton_ops.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/sparse/_semi_structured_ops.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/sparse/_semi_structured_ops.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/sparse/_semi_structured_conversions.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/sparse/_semi_structured_conversions.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/sparse/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/sparse/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/signal/windows/windows.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/signal/windows/windows.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/signal/windows/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/signal/windows/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/signal/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/signal/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/serialization.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/serialization.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/return_types.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/return_types.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/random.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/random.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/quasirandom.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/quasirandom.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/quantization/utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/quantization/utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/quantization/stubs.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/quantization/stubs.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/quantization/quantize_jit.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/quantization/quantize_jit.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/quantization/quantize_fx.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/quantization/quantize_fx.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/quantization/quantize.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/quantization/quantize.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/quantization/quantization_mappings.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/quantization/quantization_mappings.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/quantization/quant_type.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/quantization/quant_type.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/quantization/qconfig.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/quantization/qconfig.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/quantization/observer.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/quantization/observer.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/quantization/fx/utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/quantization/fx/utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/quantization/fx/quantization_types.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/quantization/fx/quantization_types.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/quantization/fx/quantization_patterns.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/quantization/fx/quantization_patterns.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/quantization/fx/prepare.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/quantization/fx/prepare.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/quantization/fx/pattern_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/quantization/fx/pattern_utils.py + for f in `find ./torch/ -name '*.py'` + install -D 
-pm 644 ./torch/quantization/fx/match_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/quantization/fx/match_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/quantization/fx/graph_module.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/quantization/fx/graph_module.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/quantization/fx/fusion_patterns.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/quantization/fx/fusion_patterns.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/quantization/fx/fuse.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/quantization/fx/fuse.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/quantization/fx/convert.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/quantization/fx/convert.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/quantization/fx/_equalize.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/quantization/fx/_equalize.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/quantization/fx/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/quantization/fx/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/quantization/fuser_method_mappings.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/quantization/fuser_method_mappings.py + 
for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/quantization/fuse_modules.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/quantization/fuse_modules.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/quantization/fake_quantize.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/quantization/fake_quantize.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/quantization/_quantized_conversions.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/quantization/_quantized_conversions.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/quantization/_numeric_suite_fx.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/quantization/_numeric_suite_fx.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/quantization/_numeric_suite.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/quantization/_numeric_suite.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/quantization/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/quantization/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/profiler/python_tracer.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/profiler/python_tracer.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/profiler/profiler.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/profiler/profiler.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/profiler/itt.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/profiler/itt.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/profiler/_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/profiler/_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/profiler/_pattern_matcher.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/profiler/_pattern_matcher.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/profiler/_memory_profiler.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/profiler/_memory_profiler.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/profiler/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/profiler/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/package/package_importer.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/package/package_importer.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/package/package_exporter.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/package/package_exporter.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/package/importer.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/package/importer.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/package/glob_group.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/package/glob_group.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/package/find_file_dependencies.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/package/find_file_dependencies.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/package/file_structure_representation.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/package/file_structure_representation.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/package/analyze/trace_dependencies.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/package/analyze/trace_dependencies.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/package/analyze/is_from_package.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/package/analyze/is_from_package.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/package/analyze/find_first_use_of_broken_modules.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/package/analyze/find_first_use_of_broken_modules.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/package/analyze/__init__.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/package/analyze/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/package/_stdlib.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/package/_stdlib.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/package/_package_unpickler.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/package/_package_unpickler.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/package/_package_pickler.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/package/_package_pickler.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/package/_mock.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/package/_mock.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/package/_mangling.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/package/_mangling.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/package/_importlib.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/package/_importlib.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/package/_directory_reader.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/package/_directory_reader.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/package/_digraph.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/package/_digraph.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/package/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/package/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/overrides.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/overrides.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/optim/swa_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/optim/swa_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/optim/sparse_adam.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/optim/sparse_adam.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/optim/sgd.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/optim/sgd.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/optim/rprop.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/optim/rprop.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/optim/rmsprop.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/optim/rmsprop.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/optim/radam.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/optim/radam.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/optim/optimizer.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/optim/optimizer.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/optim/nadam.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/optim/nadam.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/optim/lr_scheduler.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/optim/lr_scheduler.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/optim/lbfgs.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/optim/lbfgs.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/optim/asgd.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/optim/asgd.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/optim/adamw.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/optim/adamw.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/optim/adamax.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/optim/adamax.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/optim/adam.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/optim/adam.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/optim/adagrad.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/optim/adagrad.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/optim/adadelta.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/optim/adadelta.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/optim/_multi_tensor/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/optim/_multi_tensor/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/optim/_functional.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/optim/_functional.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/optim/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/optim/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/verification.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/verification.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/symbolic_opset9.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/symbolic_opset9.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/symbolic_opset8.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/symbolic_opset8.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/symbolic_opset7.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/symbolic_opset7.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/symbolic_opset20.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/symbolic_opset20.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/symbolic_opset19.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/symbolic_opset19.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/symbolic_opset18.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/symbolic_opset18.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/symbolic_opset17.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/symbolic_opset17.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/symbolic_opset16.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/symbolic_opset16.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/symbolic_opset15.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/symbolic_opset15.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/symbolic_opset14.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/symbolic_opset14.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/symbolic_opset13.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/symbolic_opset13.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/symbolic_opset12.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/symbolic_opset12.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/symbolic_opset11.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/symbolic_opset11.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/symbolic_opset10.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/symbolic_opset10.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/symbolic_helper.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/symbolic_helper.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/symbolic_caffe2.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/symbolic_caffe2.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/operators.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/operators.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/errors.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/errors.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_type_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_type_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_onnx_supported_ops.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_onnx_supported_ops.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/registration.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/registration.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/onnxruntime.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/onnxruntime.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/onnx_proto_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/onnx_proto_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/jit_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/jit_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/io_adapter.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/io_adapter.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/fx/type_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/fx/type_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/fx/torch_export_graph_extractor.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/fx/torch_export_graph_extractor.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/fx/serialization.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/fx/serialization.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/fx/registration.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/fx/registration.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/fx/patcher.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/fx/patcher.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/fx/passes/virtualization.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/fx/passes/virtualization.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/fx/passes/type_promotion.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/fx/passes/type_promotion.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/fx/passes/readability.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/fx/passes/readability.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/fx/passes/modularization.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/fx/passes/modularization.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/fx/passes/functionalization.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/fx/passes/functionalization.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/fx/passes/decomp.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/fx/passes/decomp.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/fx/passes/_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/fx/passes/_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/fx/passes/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/fx/passes/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/fx/op_validation.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/fx/op_validation.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/fx/onnxfunction_dispatcher.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/fx/onnxfunction_dispatcher.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/fx/fx_symbolic_graph_extractor.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/fx/fx_symbolic_graph_extractor.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/fx/fx_onnx_interpreter.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/fx/fx_onnx_interpreter.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/fx/dynamo_graph_extractor.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/fx/dynamo_graph_extractor.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/fx/diagnostics.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/fx/diagnostics.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/fx/decomposition_table.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/fx/decomposition_table.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/fx/decomposition_skip.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/fx/decomposition_skip.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/fx/analysis/unsupported_nodes.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/fx/analysis/unsupported_nodes.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/fx/analysis/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/fx/analysis/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/fx/_pass.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/fx/_pass.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/fx/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/fx/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/exporter.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/exporter.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/version.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/version.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_web_response.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_web_response.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_web_request.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_web_request.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_version_control_details.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_version_control_details.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_translation_metadata.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_translation_metadata.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_tool_component_reference.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_tool_component_reference.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_tool_component.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_tool_component.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_tool.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_tool.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_thread_flow_location.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_thread_flow_location.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_thread_flow.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_thread_flow.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_suppression.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_suppression.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_stack_frame.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_stack_frame.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_stack.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_stack.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_special_locations.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_special_locations.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_sarif_log.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_sarif_log.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_run_automation_details.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_run_automation_details.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_run.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_run.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_result_provenance.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_result_provenance.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_result.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_result.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_reporting_descriptor_relationship.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_reporting_descriptor_relationship.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_reporting_descriptor_reference.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_reporting_descriptor_reference.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_reporting_descriptor.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_reporting_descriptor.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_reporting_configuration.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_reporting_configuration.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_replacement.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_replacement.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_region.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_region.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_rectangle.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_rectangle.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_property_bag.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_property_bag.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_physical_location.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_physical_location.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_notification.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_notification.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_node.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_node.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_multiformat_message_string.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_multiformat_message_string.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_message.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_message.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_logical_location.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_logical_location.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_location_relationship.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_location_relationship.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_location.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_location.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_invocation.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_invocation.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_graph_traversal.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_graph_traversal.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_graph.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_graph.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_fix.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_fix.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_external_property_file_references.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_external_property_file_references.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_external_property_file_reference.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_external_property_file_reference.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_external_properties.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_external_properties.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_exception.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_exception.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_edge_traversal.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_edge_traversal.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_edge.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_edge.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_conversion.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_conversion.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_configuration_override.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_configuration_override.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_code_flow.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_code_flow.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_attachment.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_attachment.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_artifact_location.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_artifact_location.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_artifact_content.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_artifact_content.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_artifact_change.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_artifact_change.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_artifact.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_artifact.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/_address.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/_address.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/sarif/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/sarif/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/formatter.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/formatter.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/decorator.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/decorator.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/context.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/context.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/_infra.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/_infra.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/infra/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/infra/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/_rules.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/_rules.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/_diagnostic.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/_diagnostic.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/diagnostics/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/diagnostics/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/_beartype.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/_beartype.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_internal/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_internal/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_globals.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_globals.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_exporter_states.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_exporter_states.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_experimental.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_experimental.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_deprecation.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_deprecation.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/_constants.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/_constants.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/onnx/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/onnx/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/nn/utils/weight_norm.py
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/utils/weight_norm.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/utils/stateless.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/utils/stateless.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/utils/spectral_norm.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/utils/spectral_norm.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/utils/rnn.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/utils/rnn.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/utils/prune.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/utils/prune.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/utils/parametrize.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/utils/parametrize.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/utils/parametrizations.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/utils/parametrizations.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/utils/memory_format.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/utils/memory_format.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/utils/init.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/utils/init.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/utils/fusion.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/utils/fusion.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/utils/convert_parameters.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/utils/convert_parameters.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/utils/clip_grad.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/utils/clip_grad.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/utils/_per_sample_grad.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/utils/_per_sample_grad.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/utils/_named_member_accessor.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/utils/_named_member_accessor.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/utils/_expanded_weights/linear_expanded_weights.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/utils/_expanded_weights/linear_expanded_weights.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/utils/_expanded_weights/layer_norm_expanded_weights.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/utils/_expanded_weights/layer_norm_expanded_weights.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/utils/_expanded_weights/instance_norm_expanded_weights.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/utils/_expanded_weights/instance_norm_expanded_weights.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/utils/_expanded_weights/group_norm_expanded_weights.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/utils/_expanded_weights/group_norm_expanded_weights.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/utils/_expanded_weights/expanded_weights_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/utils/_expanded_weights/expanded_weights_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/utils/_expanded_weights/expanded_weights_impl.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/utils/_expanded_weights/expanded_weights_impl.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/utils/_expanded_weights/embedding_expanded_weights.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/utils/_expanded_weights/embedding_expanded_weights.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/utils/_expanded_weights/conv_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/utils/_expanded_weights/conv_utils.py + 
for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/utils/_expanded_weights/conv_expanded_weights.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/utils/_expanded_weights/conv_expanded_weights.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/utils/_expanded_weights/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/utils/_expanded_weights/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/utils/_deprecation_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/utils/_deprecation_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/utils/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/utils/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/quantized/modules/utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/quantized/modules/utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/quantized/modules/rnn.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/quantized/modules/rnn.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/quantized/modules/normalization.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/quantized/modules/normalization.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/quantized/modules/linear.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/quantized/modules/linear.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/quantized/modules/functional_modules.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/quantized/modules/functional_modules.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/quantized/modules/embedding_ops.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/quantized/modules/embedding_ops.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/quantized/modules/dropout.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/quantized/modules/dropout.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/quantized/modules/conv.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/quantized/modules/conv.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/quantized/modules/batchnorm.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/quantized/modules/batchnorm.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/quantized/modules/activation.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/quantized/modules/activation.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/quantized/modules/__init__.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/quantized/modules/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/quantized/functional.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/quantized/functional.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/quantized/dynamic/modules/rnn.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/quantized/dynamic/modules/rnn.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/quantized/dynamic/modules/linear.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/quantized/dynamic/modules/linear.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/quantized/dynamic/modules/conv.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/quantized/dynamic/modules/conv.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/quantized/dynamic/modules/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/quantized/dynamic/modules/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/quantized/dynamic/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/quantized/dynamic/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/quantized/_reference/modules/utils.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/quantized/_reference/modules/utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/quantized/_reference/modules/sparse.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/quantized/_reference/modules/sparse.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/quantized/_reference/modules/rnn.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/quantized/_reference/modules/rnn.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/quantized/_reference/modules/linear.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/quantized/_reference/modules/linear.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/quantized/_reference/modules/conv.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/quantized/_reference/modules/conv.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/quantized/_reference/modules/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/quantized/_reference/modules/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/quantized/_reference/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/quantized/_reference/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/quantized/__init__.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/quantized/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/quantizable/modules/rnn.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/quantizable/modules/rnn.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/quantizable/modules/activation.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/quantizable/modules/activation.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/quantizable/modules/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/quantizable/modules/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/quantizable/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/quantizable/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/qat/modules/linear.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/qat/modules/linear.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/qat/modules/embedding_ops.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/qat/modules/embedding_ops.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/qat/modules/conv.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/qat/modules/conv.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 
./torch/nn/qat/modules/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/qat/modules/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/qat/dynamic/modules/linear.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/qat/dynamic/modules/linear.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/qat/dynamic/modules/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/qat/dynamic/modules/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/qat/dynamic/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/qat/dynamic/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/qat/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/qat/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/parameter.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/parameter.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/parallel/scatter_gather.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/parallel/scatter_gather.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/parallel/replicate.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/parallel/replicate.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 
./torch/nn/parallel/parallel_apply.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/parallel/parallel_apply.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/parallel/distributed.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/parallel/distributed.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/parallel/data_parallel.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/parallel/data_parallel.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/parallel/comm.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/parallel/comm.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/parallel/_functions.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/parallel/_functions.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/parallel/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/parallel/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/modules/utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/modules/utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/modules/upsampling.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/modules/upsampling.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/modules/transformer.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/modules/transformer.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/modules/sparse.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/modules/sparse.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/modules/rnn.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/modules/rnn.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/modules/pooling.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/modules/pooling.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/modules/pixelshuffle.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/modules/pixelshuffle.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/modules/padding.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/modules/padding.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/modules/normalization.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/modules/normalization.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/modules/module.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/modules/module.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/modules/loss.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/modules/loss.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/modules/linear.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/modules/linear.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/modules/lazy.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/modules/lazy.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/modules/instancenorm.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/modules/instancenorm.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/modules/fold.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/modules/fold.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/modules/flatten.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/modules/flatten.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/modules/dropout.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/modules/dropout.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/modules/distance.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/modules/distance.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/modules/conv.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/modules/conv.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/modules/container.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/modules/container.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/modules/channelshuffle.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/modules/channelshuffle.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/modules/batchnorm.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/modules/batchnorm.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/modules/adaptive.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/modules/adaptive.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/modules/activation.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/modules/activation.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/modules/_functions.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/modules/_functions.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/modules/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/modules/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/nn/intrinsic/quantized/modules/linear_relu.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/intrinsic/quantized/modules/linear_relu.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/nn/intrinsic/quantized/modules/conv_relu.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/intrinsic/quantized/modules/conv_relu.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/nn/intrinsic/quantized/modules/bn_relu.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/intrinsic/quantized/modules/bn_relu.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/nn/intrinsic/quantized/modules/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/intrinsic/quantized/modules/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/nn/intrinsic/quantized/dynamic/modules/linear_relu.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/intrinsic/quantized/dynamic/modules/linear_relu.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/nn/intrinsic/quantized/dynamic/modules/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/intrinsic/quantized/dynamic/modules/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/nn/intrinsic/quantized/dynamic/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/intrinsic/quantized/dynamic/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/nn/intrinsic/quantized/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/intrinsic/quantized/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/nn/intrinsic/qat/modules/linear_relu.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/intrinsic/qat/modules/linear_relu.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/nn/intrinsic/qat/modules/linear_fused.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/intrinsic/qat/modules/linear_fused.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/nn/intrinsic/qat/modules/conv_fused.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/intrinsic/qat/modules/conv_fused.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/nn/intrinsic/qat/modules/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/intrinsic/qat/modules/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/nn/intrinsic/qat/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/intrinsic/qat/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/nn/intrinsic/modules/fused.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/intrinsic/modules/fused.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/nn/intrinsic/modules/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/intrinsic/modules/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/nn/intrinsic/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/intrinsic/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/nn/init.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/init.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/nn/grad.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/grad.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/nn/functional.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/functional.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/nn/cpp.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/cpp.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/nn/common_types.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/common_types.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/nn/backends/thnn.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/backends/thnn.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/nn/backends/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/backends/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/nn/attention/bias.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/attention/bias.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/nn/attention/_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/attention/_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/nn/attention/_templated_attention.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/attention/_templated_attention.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/nn/attention/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/attention/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/nn/_reduction.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/_reduction.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/nn/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nn/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/nested/_internal/sdpa.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nested/_internal/sdpa.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/nested/_internal/ops.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nested/_internal/ops.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/nested/_internal/nested_tensor.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nested/_internal/nested_tensor.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/nested/_internal/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nested/_internal/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/nested/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/nested/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/multiprocessing/spawn.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/multiprocessing/spawn.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/multiprocessing/reductions.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/multiprocessing/reductions.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/multiprocessing/queue.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/multiprocessing/queue.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/multiprocessing/pool.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/multiprocessing/pool.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/multiprocessing/_atfork.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/multiprocessing/_atfork.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/multiprocessing/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/multiprocessing/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/mps/profiler.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/mps/profiler.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/mps/event.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/mps/event.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/mps/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/mps/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/monitor/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/monitor/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/masked/maskedtensor/unary.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/masked/maskedtensor/unary.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/masked/maskedtensor/reductions.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/masked/maskedtensor/reductions.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/masked/maskedtensor/passthrough.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/masked/maskedtensor/passthrough.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/masked/maskedtensor/creation.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/masked/maskedtensor/creation.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/masked/maskedtensor/core.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/masked/maskedtensor/core.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/masked/maskedtensor/binary.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/masked/maskedtensor/binary.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/masked/maskedtensor/_ops_refs.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/masked/maskedtensor/_ops_refs.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/masked/maskedtensor/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/masked/maskedtensor/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/masked/_ops.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/masked/_ops.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/masked/_docs.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/masked/_docs.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/masked/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/masked/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/linalg/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/linalg/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/library.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/library.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/jit/unsupported_tensor_ops.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/jit/unsupported_tensor_ops.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/jit/supported_ops.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/jit/supported_ops.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/jit/quantized.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/jit/quantized.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/jit/mobile/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/jit/mobile/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/jit/generate_bytecode.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/jit/generate_bytecode.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/jit/frontend.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/jit/frontend.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/jit/annotations.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/jit/annotations.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/jit/_trace.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/jit/_trace.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/jit/_state.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/jit/_state.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/jit/_shape_functions.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/jit/_shape_functions.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/jit/_serialization.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/jit/_serialization.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/jit/_script.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/jit/_script.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/jit/_recursive.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/jit/_recursive.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/jit/_pickle.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/jit/_pickle.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/jit/_passes/_property_propagation.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/jit/_passes/_property_propagation.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/jit/_passes/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/jit/_passes/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/jit/_monkeytype_config.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/jit/_monkeytype_config.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/jit/_logging.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/jit/_logging.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/jit/_ir_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/jit/_ir_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/jit/_fuser.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/jit/_fuser.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/jit/_freeze.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/jit/_freeze.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/jit/_decompositions.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/jit/_decompositions.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/jit/_decomposition_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/jit/_decomposition_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/jit/_dataclass_impls.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/jit/_dataclass_impls.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/jit/_check.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/jit/_check.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/jit/_builtins.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/jit/_builtins.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/jit/_await.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/jit/_await.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/jit/_async.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/jit/_async.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/jit/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/jit/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/hub.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/hub.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/traceback.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/traceback.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/tensor_type.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/tensor_type.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/subgraph_rewriter.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/subgraph_rewriter.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/proxy.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/proxy.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/passes/utils/source_matcher_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/passes/utils/source_matcher_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/passes/utils/matcher_with_name_node_map_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/passes/utils/matcher_with_name_node_map_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/passes/utils/matcher_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/passes/utils/matcher_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/passes/utils/fuser_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/passes/utils/fuser_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/passes/utils/common.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/passes/utils/common.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/passes/utils/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/passes/utils/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/passes/tools_common.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/passes/tools_common.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/passes/tests/test_pass_manager.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/passes/tests/test_pass_manager.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/passes/tests/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/passes/tests/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/passes/splitter_base.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/passes/splitter_base.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/passes/split_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/passes/split_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/passes/split_module.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/passes/split_module.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/passes/shape_prop.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/passes/shape_prop.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/passes/reinplace.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/passes/reinplace.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/passes/pass_manager.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/passes/pass_manager.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/passes/param_fetch.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/passes/param_fetch.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/passes/operator_support.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/passes/operator_support.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/passes/net_min_base.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/passes/net_min_base.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/passes/infra/pass_manager.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/passes/infra/pass_manager.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/passes/infra/pass_base.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/passes/infra/pass_base.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/passes/infra/partitioner.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/passes/infra/partitioner.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/passes/infra/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/passes/infra/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/passes/graph_manipulation.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/passes/graph_manipulation.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/passes/graph_drawer.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/passes/graph_drawer.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/passes/fake_tensor_prop.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/passes/fake_tensor_prop.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/passes/dialect/common/cse_pass.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/passes/dialect/common/cse_pass.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/passes/dialect/common/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/passes/dialect/common/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/passes/dialect/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/passes/dialect/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/passes/backends/cudagraphs.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/passes/backends/cudagraphs.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/passes/backends/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/passes/backends/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/passes/annotate_getitem_nodes.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/passes/annotate_getitem_nodes.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/passes/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/passes/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/operator_schemas.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/operator_schemas.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/node.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/node.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/interpreter.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/interpreter.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/immutable_collections.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/immutable_collections.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/graph_module.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/graph_module.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/graph.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/graph.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/experimental/validator.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/validator.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/experimental/unify_refinements.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/unify_refinements.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/experimental/unification/variable.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/unification/variable.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/experimental/unification/utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/unification/utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/experimental/unification/unification_tools.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/unification/unification_tools.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/experimental/unification/multipledispatch/variadic.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/unification/multipledispatch/variadic.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/experimental/unification/multipledispatch/utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/unification/multipledispatch/utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/experimental/unification/multipledispatch/dispatcher.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/unification/multipledispatch/dispatcher.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/experimental/unification/multipledispatch/core.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/unification/multipledispatch/core.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/experimental/unification/multipledispatch/conflict.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/unification/multipledispatch/conflict.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/experimental/unification/multipledispatch/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/unification/multipledispatch/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/experimental/unification/more.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/unification/more.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/experimental/unification/match.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/unification/match.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/experimental/unification/dispatch.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/unification/dispatch.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/experimental/unification/core.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/unification/core.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/experimental/unification/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/unification/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/experimental/symbolic_shapes.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/symbolic_shapes.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/experimental/sym_node.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/sym_node.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/experimental/shape_inference/infer_symbol_values.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/shape_inference/infer_symbol_values.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/experimental/shape_inference/infer_shape.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/shape_inference/infer_shape.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/experimental/schema_type_annotation.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/schema_type_annotation.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/experimental/rewriter.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/rewriter.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/experimental/refinement_types.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/refinement_types.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/experimental/recording.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/recording.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/experimental/proxy_tensor.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/proxy_tensor.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/experimental/partitioner_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/partitioner_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/experimental/optimization.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/optimization.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/experimental/normalize.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/normalize.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/experimental/migrate_gradual_types/z3_types.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/migrate_gradual_types/z3_types.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/experimental/migrate_gradual_types/util.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/migrate_gradual_types/util.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/experimental/migrate_gradual_types/transform_to_z3.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/migrate_gradual_types/transform_to_z3.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/experimental/migrate_gradual_types/operation.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/migrate_gradual_types/operation.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/fx/experimental/migrate_gradual_types/constraint_transformation.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/migrate_gradual_types/constraint_transformation.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644
./torch/fx/experimental/migrate_gradual_types/constraint_generator.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/migrate_gradual_types/constraint_generator.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/fx/experimental/migrate_gradual_types/constraint.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/migrate_gradual_types/constraint.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/fx/experimental/migrate_gradual_types/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/migrate_gradual_types/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/fx/experimental/meta_tracer.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/meta_tracer.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/fx/experimental/merge_matmul.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/merge_matmul.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/fx/experimental/graph_gradual_typechecker.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/graph_gradual_typechecker.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/fx/experimental/debug.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/debug.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/fx/experimental/const_fold.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/const_fold.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/fx/experimental/accelerator_partitioner.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/accelerator_partitioner.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/fx/experimental/_sym_dispatch_mode.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/_sym_dispatch_mode.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/fx/experimental/_config.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/_config.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/fx/experimental/_backward_state.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/_backward_state.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/fx/experimental/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/experimental/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/fx/config.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/config.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/fx/annotate.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/annotate.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 
./torch/fx/_symbolic_trace.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/_symbolic_trace.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/fx/_pytree.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/_pytree.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/fx/_lazy_graph_module.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/_lazy_graph_module.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/fx/_compatibility.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/_compatibility.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/fx/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fx/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/futures/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/futures/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/functional.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/functional.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/func/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/func/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/fft/__init__.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/fft/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/export/unflatten.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/export/unflatten.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/export/graph_signature.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/export/graph_signature.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/export/exported_program.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/export/exported_program.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/export/dynamic_shapes.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/export/dynamic_shapes.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/export/custom_obj.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/export/custom_obj.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/export/_unlift.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/export/_unlift.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/export/_tree_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/export/_tree_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/export/_trace.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/export/_trace.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/export/_safeguard.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/export/_safeguard.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/export/_remove_effect_tokens_pass.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/export/_remove_effect_tokens_pass.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/export/_remove_auto_functionalized_pass.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/export/_remove_auto_functionalized_pass.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/export/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/export/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/wishart.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/wishart.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/weibull.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/weibull.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/von_mises.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/von_mises.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/utils.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/uniform.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/uniform.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/transforms.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/transforms.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/transformed_distribution.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/transformed_distribution.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/studentT.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/studentT.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/relaxed_categorical.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/relaxed_categorical.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/relaxed_bernoulli.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/relaxed_bernoulli.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/poisson.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/poisson.py + for f in `find ./torch/ -name '*.py'` + install 
-D -pm 644 ./torch/distributions/pareto.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/pareto.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/one_hot_categorical.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/one_hot_categorical.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/normal.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/normal.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/negative_binomial.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/negative_binomial.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/multivariate_normal.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/multivariate_normal.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/multinomial.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/multinomial.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/mixture_same_family.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/mixture_same_family.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/lowrank_multivariate_normal.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/lowrank_multivariate_normal.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/logistic_normal.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/logistic_normal.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/log_normal.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/log_normal.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/lkj_cholesky.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/lkj_cholesky.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/laplace.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/laplace.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/kumaraswamy.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/kumaraswamy.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/kl.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/kl.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/inverse_gamma.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/inverse_gamma.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 
./torch/distributions/independent.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/independent.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/half_normal.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/half_normal.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/half_cauchy.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/half_cauchy.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/gumbel.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/gumbel.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/geometric.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/geometric.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/gamma.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/gamma.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/fishersnedecor.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/fishersnedecor.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/exponential.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/exponential.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 
644 ./torch/distributions/exp_family.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/exp_family.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/distribution.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/distribution.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/dirichlet.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/dirichlet.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/continuous_bernoulli.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/continuous_bernoulli.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/constraints.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/constraints.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/constraint_registry.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/constraint_registry.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/chi2.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/chi2.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/cauchy.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/cauchy.py + for f in `find ./torch/ 
-name '*.py'` + install -D -pm 644 ./torch/distributions/categorical.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/categorical.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/binomial.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/binomial.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/beta.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/beta.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/bernoulli.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/bernoulli.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributions/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributions/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/tensor/parallel/style.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/tensor/parallel/style.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/tensor/parallel/loss.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/tensor/parallel/loss.py + for f in `find 
./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/tensor/parallel/input_reshard.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/tensor/parallel/input_reshard.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/tensor/parallel/fsdp.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/tensor/parallel/fsdp.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/tensor/parallel/ddp.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/tensor/parallel/ddp.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/tensor/parallel/api.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/tensor/parallel/api.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/tensor/parallel/_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/tensor/parallel/_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/tensor/parallel/_data_parallel_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/tensor/parallel/_data_parallel_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/tensor/parallel/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/tensor/parallel/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 
./torch/distributed/tensor/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/tensor/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/run.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/run.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/rpc/server_process_global_profiler.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/rpc/server_process_global_profiler.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/rpc/rref_proxy.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/rpc/rref_proxy.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/rpc/options.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/rpc/options.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/rpc/internal.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/rpc/internal.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/rpc/functions.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/rpc/functions.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/rpc/constants.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/rpc/constants.py + for f in 
`find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/rpc/backend_registry.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/rpc/backend_registry.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/rpc/api.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/rpc/api.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/rpc/_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/rpc/_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/rpc/_testing/faulty_agent_backend_registry.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/rpc/_testing/faulty_agent_backend_registry.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/rpc/_testing/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/rpc/_testing/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/rpc/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/rpc/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/rendezvous.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/rendezvous.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/remote_device.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/remote_device.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/pipeline/sync/worker.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/pipeline/sync/worker.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/pipeline/sync/utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/pipeline/sync/utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/pipeline/sync/stream.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/pipeline/sync/stream.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/pipeline/sync/skip/tracker.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/pipeline/sync/skip/tracker.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/pipeline/sync/skip/skippable.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/pipeline/sync/skip/skippable.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/pipeline/sync/skip/portal.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/pipeline/sync/skip/portal.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/pipeline/sync/skip/namespace.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/pipeline/sync/skip/namespace.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/pipeline/sync/skip/layout.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/pipeline/sync/skip/layout.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/pipeline/sync/skip/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/pipeline/sync/skip/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/pipeline/sync/pipeline.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/pipeline/sync/pipeline.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/pipeline/sync/pipe.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/pipeline/sync/pipe.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/pipeline/sync/phony.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/pipeline/sync/phony.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/pipeline/sync/microbatch.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/pipeline/sync/microbatch.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/pipeline/sync/dependency.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/pipeline/sync/dependency.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/pipeline/sync/copy.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/pipeline/sync/copy.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/pipeline/sync/checkpoint.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/pipeline/sync/checkpoint.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/pipeline/sync/batchnorm.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/pipeline/sync/batchnorm.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/pipeline/sync/_balance/profile.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/pipeline/sync/_balance/profile.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/pipeline/sync/_balance/blockpartition.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/pipeline/sync/_balance/blockpartition.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/pipeline/sync/_balance/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/pipeline/sync/_balance/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/pipeline/sync/__init__.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/pipeline/sync/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/pipeline/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/pipeline/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/optim/zero_redundancy_optimizer.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/optim/zero_redundancy_optimizer.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/optim/utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/optim/utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/optim/post_localSGD_optimizer.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/optim/post_localSGD_optimizer.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/optim/optimizer.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/optim/optimizer.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/optim/named_optimizer.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/optim/named_optimizer.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/optim/functional_sgd.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/optim/functional_sgd.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/optim/functional_rprop.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/optim/functional_rprop.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/optim/functional_rmsprop.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/optim/functional_rmsprop.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/optim/functional_adamw.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/optim/functional_adamw.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/optim/functional_adamax.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/optim/functional_adamax.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/optim/functional_adam.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/optim/functional_adam.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/optim/functional_adagrad.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/optim/functional_adagrad.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/optim/functional_adadelta.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/optim/functional_adadelta.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/optim/apply_optimizer_in_backward.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/optim/apply_optimizer_in_backward.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/optim/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/optim/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/nn/jit/templates/remote_module_template.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/nn/jit/templates/remote_module_template.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/nn/jit/templates/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/nn/jit/templates/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/nn/jit/instantiator.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/nn/jit/instantiator.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/nn/jit/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/nn/jit/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/nn/functional.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/nn/functional.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/nn/api/remote_module.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/nn/api/remote_module.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/nn/api/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/nn/api/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/nn/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/nn/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/logging_handlers.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/logging_handlers.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/launcher/api.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/launcher/api.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/launcher/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/launcher/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/launch.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/launch.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 
./torch/distributed/fsdp/wrap.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/fsdp/wrap.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/fsdp/sharded_grad_scaler.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/fsdp/sharded_grad_scaler.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/fsdp/fully_sharded_data_parallel.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/fsdp/fully_sharded_data_parallel.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/fsdp/api.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/fsdp/api.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/fsdp/_wrap_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/fsdp/_wrap_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/fsdp/_unshard_param_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/fsdp/_unshard_param_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/fsdp/_traversal_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/fsdp/_traversal_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/fsdp/_trace_utils.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/fsdp/_trace_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/fsdp/_state_dict_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/fsdp/_state_dict_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/fsdp/_shard_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/fsdp/_shard_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/fsdp/_runtime_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/fsdp/_runtime_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/fsdp/_optim_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/fsdp/_optim_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/fsdp/_limiter_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/fsdp/_limiter_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/fsdp/_init_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/fsdp/_init_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/fsdp/_fsdp_extensions.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/fsdp/_fsdp_extensions.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/fsdp/_flat_param.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/fsdp/_flat_param.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/fsdp/_exec_order_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/fsdp/_exec_order_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/fsdp/_dynamo_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/fsdp/_dynamo_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/fsdp/_debug_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/fsdp/_debug_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/fsdp/_common_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/fsdp/_common_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/fsdp/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/fsdp/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/examples/memory_tracker_example.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/examples/memory_tracker_example.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/utils/store.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/utils/store.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/utils/logging.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/utils/logging.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/utils/log_level.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/utils/log_level.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/utils/distributed.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/utils/distributed.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/utils/data/elastic_distributed_sampler.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/utils/data/elastic_distributed_sampler.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/utils/data/cycling_iterator.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/utils/data/cycling_iterator.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/utils/data/__init__.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/utils/data/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/utils/api.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/utils/api.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/utils/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/utils/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/timer/local_timer.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/timer/local_timer.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/timer/file_based_local_timer.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/timer/file_based_local_timer.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/timer/api.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/timer/api.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/timer/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/timer/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/rendezvous/utils.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/rendezvous/utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/rendezvous/static_tcp_rendezvous.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/rendezvous/static_tcp_rendezvous.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/rendezvous/registry.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/rendezvous/registry.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/rendezvous/etcd_store.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/rendezvous/etcd_store.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/rendezvous/etcd_server.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/rendezvous/etcd_server.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/rendezvous/etcd_rendezvous_backend.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/rendezvous/etcd_rendezvous_backend.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/rendezvous/etcd_rendezvous.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/rendezvous/etcd_rendezvous.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 
./torch/distributed/elastic/rendezvous/dynamic_rendezvous.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/rendezvous/dynamic_rendezvous.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/rendezvous/c10d_rendezvous_backend.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/rendezvous/c10d_rendezvous_backend.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/rendezvous/api.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/rendezvous/api.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/rendezvous/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/rendezvous/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/multiprocessing/tail_log.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/multiprocessing/tail_log.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/multiprocessing/subprocess_handler/subprocess_handler.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/multiprocessing/subprocess_handler/subprocess_handler.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/multiprocessing/subprocess_handler/handlers.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/multiprocessing/subprocess_handler/handlers.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/multiprocessing/subprocess_handler/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/multiprocessing/subprocess_handler/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/multiprocessing/redirects.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/multiprocessing/redirects.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/multiprocessing/errors/handlers.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/multiprocessing/errors/handlers.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/multiprocessing/errors/error_handler.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/multiprocessing/errors/error_handler.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/multiprocessing/errors/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/multiprocessing/errors/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/multiprocessing/api.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/multiprocessing/api.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/multiprocessing/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/multiprocessing/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/metrics/api.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/metrics/api.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/metrics/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/metrics/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/events/handlers.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/events/handlers.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/events/api.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/events/api.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/events/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/events/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/elastic/agent/server/local_elastic_agent.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/agent/server/local_elastic_agent.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/distributed/elastic/agent/server/health_check_server.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/agent/server/health_check_server.py
+ install -D -pm 644 ./torch/distributed/elastic/agent/server/api.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/agent/server/api.py
+ install -D -pm 644 ./torch/distributed/elastic/agent/server/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/agent/server/__init__.py
+ install -D -pm 644 ./torch/distributed/elastic/agent/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/agent/__init__.py
+ install -D -pm 644 ./torch/distributed/elastic/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/elastic/__init__.py
+ install -D -pm 644 ./torch/distributed/distributed_c10d.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/distributed_c10d.py
+ install -D -pm 644 ./torch/distributed/device_mesh.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/device_mesh.py
+ install -D -pm 644 ./torch/distributed/constants.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/constants.py
+ install -D -pm 644 ./torch/distributed/collective_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/collective_utils.py
+ install -D -pm 644 ./torch/distributed/checkpoint/utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/checkpoint/utils.py
+ install -D -pm 644 ./torch/distributed/checkpoint/storage.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/checkpoint/storage.py
+ install -D -pm 644 ./torch/distributed/checkpoint/stateful.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/checkpoint/stateful.py
+ install -D -pm 644 ./torch/distributed/checkpoint/state_dict_saver.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/checkpoint/state_dict_saver.py
+ install -D -pm 644 ./torch/distributed/checkpoint/state_dict_loader.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/checkpoint/state_dict_loader.py
+ install -D -pm 644 ./torch/distributed/checkpoint/state_dict.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/checkpoint/state_dict.py
+ install -D -pm 644 ./torch/distributed/checkpoint/resharding.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/checkpoint/resharding.py
+ install -D -pm 644 ./torch/distributed/checkpoint/planner_helpers.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/checkpoint/planner_helpers.py
+ install -D -pm 644 ./torch/distributed/checkpoint/planner.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/checkpoint/planner.py
+ install -D -pm 644 ./torch/distributed/checkpoint/optimizer.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/checkpoint/optimizer.py
+ install -D -pm 644 ./torch/distributed/checkpoint/metadata.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/checkpoint/metadata.py
+ install -D -pm 644 ./torch/distributed/checkpoint/logging_handlers.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/checkpoint/logging_handlers.py
+ install -D -pm 644 ./torch/distributed/checkpoint/logger.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/checkpoint/logger.py
+ install -D -pm 644 ./torch/distributed/checkpoint/format_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/checkpoint/format_utils.py
+ install -D -pm 644 ./torch/distributed/checkpoint/filesystem.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/checkpoint/filesystem.py
+ install -D -pm 644 ./torch/distributed/checkpoint/examples/stateful_example.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/checkpoint/examples/stateful_example.py
+ install -D -pm 644 ./torch/distributed/checkpoint/examples/fsdp_checkpoint_example.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/checkpoint/examples/fsdp_checkpoint_example.py
+ install -D -pm 644 ./torch/distributed/checkpoint/examples/async_checkpointing_example.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/checkpoint/examples/async_checkpointing_example.py
+ install -D -pm 644 ./torch/distributed/checkpoint/default_planner.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/checkpoint/default_planner.py
+ install -D -pm 644 ./torch/distributed/checkpoint/api.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/checkpoint/api.py
+ install -D -pm 644 ./torch/distributed/checkpoint/_traverse.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/checkpoint/_traverse.py
+ install -D -pm 644 ./torch/distributed/checkpoint/_storage_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/checkpoint/_storage_utils.py
+ install -D -pm 644 ./torch/distributed/checkpoint/_sharded_tensor_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/checkpoint/_sharded_tensor_utils.py
+ install -D -pm 644 ./torch/distributed/checkpoint/_nested_dict.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/checkpoint/_nested_dict.py
+ install -D -pm 644 ./torch/distributed/checkpoint/_fsspec_filesystem.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/checkpoint/_fsspec_filesystem.py
+ install -D -pm 644 ./torch/distributed/checkpoint/_dedup_tensors.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/checkpoint/_dedup_tensors.py
+ install -D -pm 644 ./torch/distributed/checkpoint/_dedup_save_plans.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/checkpoint/_dedup_save_plans.py
+ install -D -pm 644 ./torch/distributed/checkpoint/_checkpointer.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/checkpoint/_checkpointer.py
+ install -D -pm 644 ./torch/distributed/checkpoint/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/checkpoint/__init__.py
+ install -D -pm 644 ./torch/distributed/c10d_logger.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/c10d_logger.py
+ install -D -pm 644 ./torch/distributed/benchmarks/benchmark_ddp_rpc.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/benchmarks/benchmark_ddp_rpc.py
+ install -D -pm 644 ./torch/distributed/autograd/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/autograd/__init__.py
+ install -D -pm 644 ./torch/distributed/argparse_util.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/argparse_util.py
+ install -D -pm 644 ./torch/distributed/algorithms/model_averaging/utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/algorithms/model_averaging/utils.py
+ install -D -pm 644 ./torch/distributed/algorithms/model_averaging/hierarchical_model_averager.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/algorithms/model_averaging/hierarchical_model_averager.py
+ install -D -pm 644 ./torch/distributed/algorithms/model_averaging/averagers.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/algorithms/model_averaging/averagers.py
+ install -D -pm 644 ./torch/distributed/algorithms/model_averaging/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/algorithms/model_averaging/__init__.py
+ install -D -pm 644 ./torch/distributed/algorithms/join.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/algorithms/join.py
+ install -D -pm 644 ./torch/distributed/algorithms/ddp_comm_hooks/quantization_hooks.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/algorithms/ddp_comm_hooks/quantization_hooks.py
+ install -D -pm 644 ./torch/distributed/algorithms/ddp_comm_hooks/powerSGD_hook.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/algorithms/ddp_comm_hooks/powerSGD_hook.py
+ install -D -pm 644 ./torch/distributed/algorithms/ddp_comm_hooks/post_localSGD_hook.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/algorithms/ddp_comm_hooks/post_localSGD_hook.py
+ install -D -pm 644 ./torch/distributed/algorithms/ddp_comm_hooks/optimizer_overlap_hooks.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/algorithms/ddp_comm_hooks/optimizer_overlap_hooks.py
+ install -D -pm 644 ./torch/distributed/algorithms/ddp_comm_hooks/mixed_precision_hooks.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/algorithms/ddp_comm_hooks/mixed_precision_hooks.py
+ install -D -pm 644 ./torch/distributed/algorithms/ddp_comm_hooks/default_hooks.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/algorithms/ddp_comm_hooks/default_hooks.py
+ install -D -pm 644 ./torch/distributed/algorithms/ddp_comm_hooks/debugging_hooks.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/algorithms/ddp_comm_hooks/debugging_hooks.py
+ install -D -pm 644 ./torch/distributed/algorithms/ddp_comm_hooks/ddp_zero_hook.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/algorithms/ddp_comm_hooks/ddp_zero_hook.py
+ install -D -pm 644 ./torch/distributed/algorithms/ddp_comm_hooks/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/algorithms/ddp_comm_hooks/__init__.py
+ install -D -pm 644 ./torch/distributed/algorithms/_quantization/quantization.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/algorithms/_quantization/quantization.py
+ install -D -pm 644 ./torch/distributed/algorithms/_quantization/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/algorithms/_quantization/__init__.py
+ install -D -pm 644 ./torch/distributed/algorithms/_optimizer_overlap/optimizer_overlap.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/algorithms/_optimizer_overlap/optimizer_overlap.py
+ install -D -pm 644 ./torch/distributed/algorithms/_optimizer_overlap/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/algorithms/_optimizer_overlap/__init__.py
+ install -D -pm 644 ./torch/distributed/algorithms/_comm_hooks/default_hooks.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/algorithms/_comm_hooks/default_hooks.py
+ install -D -pm 644 ./torch/distributed/algorithms/_comm_hooks/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/algorithms/_comm_hooks/__init__.py
+ install -D -pm 644 ./torch/distributed/algorithms/_checkpoint/checkpoint_wrapper.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/algorithms/_checkpoint/checkpoint_wrapper.py
+ install -D -pm 644 ./torch/distributed/algorithms/_checkpoint/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/algorithms/_checkpoint/__init__.py
+ install -D -pm 644 ./torch/distributed/algorithms/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/algorithms/__init__.py
+ install -D -pm 644 ./torch/distributed/_tools/memory_tracker.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_tools/memory_tracker.py
+ install -D -pm 644 ./torch/distributed/_tools/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_tools/__init__.py
+ install -D -pm 644 ./torch/distributed/_tensor/tp_conv.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_tensor/tp_conv.py
+ install -D -pm 644 ./torch/distributed/_tensor/sharding_prop.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_tensor/sharding_prop.py
+ install -D -pm 644 ./torch/distributed/_tensor/redistribute.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_tensor/redistribute.py
+ install -D -pm 644 ./torch/distributed/_tensor/random.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_tensor/random.py
+ install -D -pm 644 ./torch/distributed/_tensor/placement_types.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_tensor/placement_types.py
+ install -D -pm 644 ./torch/distributed/_tensor/ops/view_ops.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_tensor/ops/view_ops.py
+ install -D -pm 644 ./torch/distributed/_tensor/ops/utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_tensor/ops/utils.py
+ install -D -pm 644 ./torch/distributed/_tensor/ops/tensor_ops.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_tensor/ops/tensor_ops.py
+ install -D -pm 644 ./torch/distributed/_tensor/ops/random_ops.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_tensor/ops/random_ops.py
+ install -D -pm 644 ./torch/distributed/_tensor/ops/pointwise_ops.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_tensor/ops/pointwise_ops.py
+ install -D -pm 644 ./torch/distributed/_tensor/ops/matrix_ops.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_tensor/ops/matrix_ops.py
+ install -D -pm 644 ./torch/distributed/_tensor/ops/math_ops.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_tensor/ops/math_ops.py
+ install -D -pm 644 ./torch/distributed/_tensor/ops/experimental_ops.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_tensor/ops/experimental_ops.py
+ install -D -pm 644 ./torch/distributed/_tensor/ops/embedding_ops.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_tensor/ops/embedding_ops.py
+ install -D -pm 644 ./torch/distributed/_tensor/ops/conv_ops.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_tensor/ops/conv_ops.py
+ install -D -pm 644 ./torch/distributed/_tensor/ops/common_rules.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_tensor/ops/common_rules.py
+ install -D -pm 644 ./torch/distributed/_tensor/ops/basic_strategy.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_tensor/ops/basic_strategy.py
+ install -D -pm 644 ./torch/distributed/_tensor/ops/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_tensor/ops/__init__.py
+ install -D -pm 644 ./torch/distributed/_tensor/op_schema.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_tensor/op_schema.py
+ install -D -pm 644 ./torch/distributed/_tensor/experimental/tp_transform.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_tensor/experimental/tp_transform.py
+ install -D -pm 644 ./torch/distributed/_tensor/experimental/attention.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_tensor/experimental/attention.py
+ install -D -pm 644 ./torch/distributed/_tensor/experimental/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_tensor/experimental/__init__.py
+ install -D -pm 644 ./torch/distributed/_tensor/examples/visualize_sharding_example.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_tensor/examples/visualize_sharding_example.py
+ install -D -pm 644 ./torch/distributed/_tensor/examples/torchrec_sharding_example.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_tensor/examples/torchrec_sharding_example.py
+ install -D -pm 644 ./torch/distributed/_tensor/examples/convnext_example.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_tensor/examples/convnext_example.py
+ install -D -pm 644 ./torch/distributed/_tensor/examples/checkpoint_example.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_tensor/examples/checkpoint_example.py
+ install -D -pm 644 ./torch/distributed/_tensor/dispatch.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_tensor/dispatch.py
+ install -D -pm 644 ./torch/distributed/_tensor/device_mesh.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_tensor/device_mesh.py
+ install -D -pm 644 ./torch/distributed/_tensor/debug/visualize_sharding.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_tensor/debug/visualize_sharding.py
+ install -D -pm 644 ./torch/distributed/_tensor/debug/op_coverage.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_tensor/debug/op_coverage.py
+ install -D -pm 644 ./torch/distributed/_tensor/debug/comm_mode.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_tensor/debug/comm_mode.py
+ install -D -pm 644 ./torch/distributed/_tensor/debug/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_tensor/debug/__init__.py
+ install -D -pm 644 ./torch/distributed/_tensor/api.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_tensor/api.py
+ install -D -pm 644 ./torch/distributed/_tensor/_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_tensor/_utils.py
+ install -D -pm 644 ./torch/distributed/_tensor/_collective_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_tensor/_collective_utils.py
+ install -D -pm 644 ./torch/distributed/_tensor/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_tensor/__init__.py
+ install -D -pm 644 ./torch/distributed/_state_dict_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_state_dict_utils.py
+ install -D -pm 644 ./torch/distributed/_spmd/partial_lower.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_spmd/partial_lower.py
+ install -D -pm 644 ./torch/distributed/_spmd/parallel_mode.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_spmd/parallel_mode.py
+ install -D -pm 644 ./torch/distributed/_spmd/log_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_spmd/log_utils.py
+ install -D -pm 644 ./torch/distributed/_spmd/iter_graph_module.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_spmd/iter_graph_module.py
+ install -D -pm 644 ./torch/distributed/_spmd/graph_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_spmd/graph_utils.py
+ install -D -pm 644 ./torch/distributed/_spmd/graph_optimization.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_spmd/graph_optimization.py
+ install -D -pm 644 ./torch/distributed/_spmd/gm_transformation.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_spmd/gm_transformation.py
+ install -D -pm 644 ./torch/distributed/_spmd/experimental_ops.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_spmd/experimental_ops.py
+ install -D -pm 644 ./torch/distributed/_spmd/distribute.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_spmd/distribute.py
+ install -D -pm 644 ./torch/distributed/_spmd/data_parallel.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_spmd/data_parallel.py
+ install -D -pm 644 ./torch/distributed/_spmd/config.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_spmd/config.py
+ install -D -pm 644 ./torch/distributed/_spmd/comm_tensor.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_spmd/comm_tensor.py
+ install -D -pm 644 ./torch/distributed/_spmd/batch_dim_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_spmd/batch_dim_utils.py
+ install -D -pm 644 ./torch/distributed/_spmd/api.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_spmd/api.py
+ install -D -pm 644 ./torch/distributed/_spmd/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_spmd/__init__.py
+ install -D -pm 644 ./torch/distributed/_sharding_spec/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_sharding_spec/__init__.py
+ install -D -pm 644 ./torch/distributed/_sharded_tensor/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_sharded_tensor/__init__.py
+ install -D -pm 644 ./torch/distributed/_shard/sharding_spec/chunk_sharding_spec_ops/embedding_bag.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_shard/sharding_spec/chunk_sharding_spec_ops/embedding_bag.py
+ install -D -pm 644 ./torch/distributed/_shard/sharding_spec/chunk_sharding_spec_ops/embedding.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_shard/sharding_spec/chunk_sharding_spec_ops/embedding.py
+ install -D -pm 644 ./torch/distributed/_shard/sharding_spec/chunk_sharding_spec_ops/_common.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_shard/sharding_spec/chunk_sharding_spec_ops/_common.py
+ install -D -pm 644 ./torch/distributed/_shard/sharding_spec/chunk_sharding_spec_ops/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_shard/sharding_spec/chunk_sharding_spec_ops/__init__.py
+ install -D -pm 644 ./torch/distributed/_shard/sharding_spec/chunk_sharding_spec.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_shard/sharding_spec/chunk_sharding_spec.py
+ install -D -pm 644 ./torch/distributed/_shard/sharding_spec/api.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_shard/sharding_spec/api.py
+ install -D -pm 644 ./torch/distributed/_shard/sharding_spec/_internals.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_shard/sharding_spec/_internals.py
+ install -D -pm 644 ./torch/distributed/_shard/sharding_spec/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_shard/sharding_spec/__init__.py
+ install -D -pm 644 ./torch/distributed/_shard/sharding_plan/api.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_shard/sharding_plan/api.py
+ install -D -pm 644 ./torch/distributed/_shard/sharding_plan/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_shard/sharding_plan/__init__.py
+ install -D -pm 644 ./torch/distributed/_shard/sharder.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_shard/sharder.py
+ install -D -pm 644 ./torch/distributed/_shard/sharded_tensor/utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_shard/sharded_tensor/utils.py
+ install -D -pm 644 ./torch/distributed/_shard/sharded_tensor/shard.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_shard/sharded_tensor/shard.py
+ install -D -pm 644 ./torch/distributed/_shard/sharded_tensor/reshard.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_shard/sharded_tensor/reshard.py
+ install -D -pm 644 ./torch/distributed/_shard/sharded_tensor/metadata.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_shard/sharded_tensor/metadata.py
+ install -D -pm 644 ./torch/distributed/_shard/sharded_tensor/logging_handlers.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_shard/sharded_tensor/logging_handlers.py
+ install -D -pm 644 ./torch/distributed/_shard/sharded_tensor/logger.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_shard/sharded_tensor/logger.py
+ install -D -pm 644 ./torch/distributed/_shard/sharded_tensor/api.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_shard/sharded_tensor/api.py
+ install -D -pm 644 ./torch/distributed/_shard/sharded_tensor/_ops/tensor_ops.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_shard/sharded_tensor/_ops/tensor_ops.py
+ install -D -pm 644 ./torch/distributed/_shard/sharded_tensor/_ops/misc_ops.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_shard/sharded_tensor/_ops/misc_ops.py
+ install -D -pm 644 ./torch/distributed/_shard/sharded_tensor/_ops/init.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_shard/sharded_tensor/_ops/init.py
+ install -D -pm 644 ./torch/distributed/_shard/sharded_tensor/_ops/binary_cmp.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_shard/sharded_tensor/_ops/binary_cmp.py
+ install -D -pm 644 ./torch/distributed/_shard/sharded_tensor/_ops/_common.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_shard/sharded_tensor/_ops/_common.py
+ install -D -pm 644 ./torch/distributed/_shard/sharded_tensor/_ops/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_shard/sharded_tensor/_ops/__init__.py
+
for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/_shard/sharded_tensor/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_shard/sharded_tensor/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/_shard/sharded_optim/api.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_shard/sharded_optim/api.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/_shard/sharded_optim/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_shard/sharded_optim/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/_shard/op_registry_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_shard/op_registry_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/_shard/metadata.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_shard/metadata.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/_shard/common_op_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_shard/common_op_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/_shard/checkpoint/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_shard/checkpoint/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 
./torch/distributed/_shard/api.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_shard/api.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/_shard/_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_shard/_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/_shard/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_shard/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/_functional_collectives_impl.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_functional_collectives_impl.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/_functional_collectives.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_functional_collectives.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/_composable_state.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_composable_state.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/_composable/replicate.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_composable/replicate.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/_composable/fully_shard.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_composable/fully_shard.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/_composable/fsdp/fully_shard.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_composable/fsdp/fully_shard.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/_composable/fsdp/_fsdp_state.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_composable/fsdp/_fsdp_state.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/_composable/fsdp/_fsdp_param_group.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_composable/fsdp/_fsdp_param_group.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/_composable/fsdp/_fsdp_param.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_composable/fsdp/_fsdp_param.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/_composable/fsdp/_fsdp_init.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_composable/fsdp/_fsdp_init.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/_composable/fsdp/_fsdp_common.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_composable/fsdp/_fsdp_common.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/_composable/fsdp/_fsdp_collectives.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_composable/fsdp/_fsdp_collectives.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/_composable/fsdp/_fsdp_api.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_composable/fsdp/_fsdp_api.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/_composable/fsdp/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_composable/fsdp/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/_composable/contract.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_composable/contract.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/_composable/checkpoint_activation.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_composable/checkpoint_activation.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/_composable/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/_composable/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/distributed/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/distributed/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/cuda/streams.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/cuda/streams.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/cuda/sparse.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/cuda/sparse.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/cuda/random.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/cuda/random.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/cuda/profiler.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/cuda/profiler.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/cuda/nvtx.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/cuda/nvtx.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/cuda/nccl.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/cuda/nccl.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/cuda/memory.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/cuda/memory.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/cuda/jiterator.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/cuda/jiterator.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/cuda/graphs.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/cuda/graphs.py + for f in `find ./torch/ -name '*.py'` + install -D 
-pm 644 ./torch/cuda/error.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/cuda/error.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/cuda/comm.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/cuda/comm.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/cuda/amp/grad_scaler.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/cuda/amp/grad_scaler.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/cuda/amp/common.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/cuda/amp/common.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/cuda/amp/autocast_mode.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/cuda/amp/autocast_mode.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/cuda/amp/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/cuda/amp/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/cuda/_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/cuda/_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/cuda/_sanitizer.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/cuda/_sanitizer.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/cuda/_memory_viz.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/cuda/_memory_viz.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/cuda/_gpu_trace.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/cuda/_gpu_trace.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/cuda/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/cuda/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/csrc/lazy/test_mnist.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/csrc/lazy/test_mnist.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/csrc/jit/tensorexpr/scripts/bisect.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/csrc/jit/tensorexpr/scripts/bisect.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/csrc/jit/tensorexpr/codegen_external.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/csrc/jit/tensorexpr/codegen_external.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/cpu/amp/grad_scaler.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/cpu/amp/grad_scaler.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/cpu/amp/autocast_mode.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/cpu/amp/autocast_mode.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/cpu/amp/__init__.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/cpu/amp/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/cpu/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/cpu/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/contrib/_tensorboard_vis.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/contrib/_tensorboard_vis.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/contrib/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/contrib/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/compiler/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/compiler/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/backends/xnnpack/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/backends/xnnpack/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/backends/xeon/run_cpu.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/backends/xeon/run_cpu.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/backends/xeon/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/backends/xeon/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/backends/quantized/__init__.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/backends/quantized/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/backends/opt_einsum/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/backends/opt_einsum/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/backends/openmp/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/backends/openmp/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/backends/nnpack/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/backends/nnpack/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/backends/mps/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/backends/mps/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/backends/mkldnn/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/backends/mkldnn/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/backends/mkl/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/backends/mkl/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/backends/mha/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/backends/mha/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/backends/cudnn/rnn.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/backends/cudnn/rnn.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/backends/cudnn/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/backends/cudnn/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/backends/cuda/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/backends/cuda/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/backends/cpu/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/backends/cpu/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/backends/_nnapi/serializer.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/backends/_nnapi/serializer.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/backends/_nnapi/prepare.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/backends/_nnapi/prepare.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/backends/_nnapi/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/backends/_nnapi/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/backends/_coreml/preprocess.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/backends/_coreml/preprocess.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/backends/_coreml/__init__.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/backends/_coreml/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/backends/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/backends/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/autograd/variable.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/autograd/variable.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/autograd/profiler_util.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/autograd/profiler_util.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/autograd/profiler_legacy.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/autograd/profiler_legacy.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/autograd/profiler.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/autograd/profiler.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/autograd/graph.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/autograd/graph.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/autograd/gradcheck.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/autograd/gradcheck.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/autograd/grad_mode.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/autograd/grad_mode.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/autograd/functional.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/autograd/functional.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/autograd/function.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/autograd/function.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/autograd/forward_ad.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/autograd/forward_ad.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/autograd/anomaly_mode.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/autograd/anomaly_mode.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/autograd/_functions/utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/autograd/_functions/utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/autograd/_functions/tensor.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/autograd/_functions/tensor.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/autograd/_functions/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/autograd/_functions/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/autograd/__init__.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/autograd/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/quantization/utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/quantization/stubs.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/stubs.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/quantization/quantizer/xnnpack_quantizer_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/quantizer/xnnpack_quantizer_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/quantization/quantizer/xnnpack_quantizer.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/quantizer/xnnpack_quantizer.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/quantization/quantizer/x86_inductor_quantizer.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/quantizer/x86_inductor_quantizer.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/quantization/quantizer/utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/quantizer/utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/quantization/quantizer/quantizer.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/quantizer/quantizer.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/quantizer/embedding_quantizer.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/quantizer/embedding_quantizer.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/quantizer/composable_quantizer.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/quantizer/composable_quantizer.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/quantizer/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/quantizer/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/quantize_pt2e.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/quantize_pt2e.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/quantize_jit.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/quantize_jit.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/quantize_fx.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/quantize_fx.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/quantize.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/quantize.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/quantization_mappings.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/quantization_mappings.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/quant_type.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/quant_type.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/qconfig_mapping.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/qconfig_mapping.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/qconfig.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/qconfig.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/pt2e/utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/pt2e/utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/pt2e/representation/rewrite.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/pt2e/representation/rewrite.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/pt2e/representation/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/pt2e/representation/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/pt2e/qat_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/pt2e/qat_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/pt2e/prepare.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/pt2e/prepare.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/pt2e/port_metadata_pass.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/pt2e/port_metadata_pass.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/pt2e/graph_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/pt2e/graph_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/pt2e/generate_numeric_debug_handle.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/pt2e/generate_numeric_debug_handle.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/pt2e/export_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/pt2e/export_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/pt2e/duplicate_dq_pass.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/pt2e/duplicate_dq_pass.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/pt2e/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/pt2e/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/observer.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/observer.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/fx/utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/fx/utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/fx/tracer.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/fx/tracer.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/fx/quantize_handler.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/fx/quantize_handler.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/fx/qconfig_mapping_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/fx/qconfig_mapping_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/fx/prepare.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/fx/prepare.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/fx/pattern_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/fx/pattern_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/fx/match_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/fx/match_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/fx/lstm_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/fx/lstm_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/fx/lower_to_qnnpack.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/fx/lower_to_qnnpack.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/fx/lower_to_fbgemm.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/fx/lower_to_fbgemm.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/fx/graph_module.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/fx/graph_module.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/fx/fuse_handler.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/fx/fuse_handler.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/fx/fuse.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/fx/fuse.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/fx/custom_config.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/fx/custom_config.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/fx/convert.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/fx/convert.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/fx/_model_report/model_report_visualizer.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/fx/_model_report/model_report_visualizer.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/fx/_model_report/model_report_observer.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/fx/_model_report/model_report_observer.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/fx/_model_report/model_report.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/fx/_model_report/model_report.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/fx/_model_report/detector.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/fx/_model_report/detector.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/fx/_model_report/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/fx/_model_report/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/fx/_lower_to_native_backend.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/fx/_lower_to_native_backend.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/fx/_equalize.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/fx/_equalize.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/fx/_decomposed.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/fx/_decomposed.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/fx/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/fx/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/fuser_method_mappings.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/fuser_method_mappings.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/fuse_modules.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/fuse_modules.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/fake_quantize.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/fake_quantize.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/experimental/quantizer.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/experimental/quantizer.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/experimental/qconfig.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/experimental/qconfig.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/experimental/observer.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/experimental/observer.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/experimental/linear.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/experimental/linear.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/experimental/fake_quantize_function.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/experimental/fake_quantize_function.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/experimental/fake_quantize.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/experimental/fake_quantize.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/experimental/apot_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/experimental/apot_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/experimental/APoT_tensor.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/experimental/APoT_tensor.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/backend_config/x86.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/backend_config/x86.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/backend_config/utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/backend_config/utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/backend_config/tensorrt.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/backend_config/tensorrt.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/backend_config/qnnpack.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/backend_config/qnnpack.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/backend_config/onednn.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/backend_config/onednn.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/backend_config/observation_type.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/backend_config/observation_type.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/backend_config/native.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/backend_config/native.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/backend_config/fbgemm.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/backend_config/fbgemm.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/backend_config/executorch.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/backend_config/executorch.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/backend_config/backend_config.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/backend_config/backend_config.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/backend_config/_qnnpack_pt2e.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/backend_config/_qnnpack_pt2e.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/backend_config/_common_operator_config_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/backend_config/_common_operator_config_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/backend_config/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/backend_config/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/_learnable_fake_quantize.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/_learnable_fake_quantize.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/_equalize.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/_equalize.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/_correct_bias.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/_correct_bias.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/quantization/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/quantization/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/pruning/sparsifier/weight_norm_sparsifier.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/pruning/sparsifier/weight_norm_sparsifier.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/pruning/sparsifier/utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/pruning/sparsifier/utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/pruning/sparsifier/nearly_diagonal_sparsifier.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/pruning/sparsifier/nearly_diagonal_sparsifier.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/pruning/sparsifier/base_sparsifier.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/pruning/sparsifier/base_sparsifier.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/pruning/sparsifier/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/pruning/sparsifier/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/pruning/scheduler/lambda_scheduler.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/pruning/scheduler/lambda_scheduler.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/pruning/scheduler/cubic_scheduler.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/pruning/scheduler/cubic_scheduler.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/pruning/scheduler/base_scheduler.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/pruning/scheduler/base_scheduler.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/pruning/scheduler/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/pruning/scheduler/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/pruning/_mappings.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/pruning/_mappings.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/pruning/_experimental/pruner/saliency_pruner.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/pruning/_experimental/pruner/saliency_pruner.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/pruning/_experimental/pruner/prune_functions.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/pruning/_experimental/pruner/prune_functions.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/pruning/_experimental/pruner/parametrization.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/pruning/_experimental/pruner/parametrization.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/pruning/_experimental/pruner/match_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/pruning/_experimental/pruner/match_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/pruning/_experimental/pruner/lstm_saliency_pruner.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/pruning/_experimental/pruner/lstm_saliency_pruner.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/pruning/_experimental/pruner/base_structured_sparsifier.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/pruning/_experimental/pruner/base_structured_sparsifier.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/pruning/_experimental/pruner/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/pruning/_experimental/pruner/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/pruning/_experimental/pruner/FPGM_pruner.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/pruning/_experimental/pruner/FPGM_pruner.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/pruning/_experimental/data_sparsifier/quantization_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/pruning/_experimental/data_sparsifier/quantization_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/pruning/_experimental/data_sparsifier/lightning/tests/test_callbacks.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/pruning/_experimental/data_sparsifier/lightning/tests/test_callbacks.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/pruning/_experimental/data_sparsifier/lightning/callbacks/data_sparsity.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/pruning/_experimental/data_sparsifier/lightning/callbacks/data_sparsity.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/pruning/_experimental/data_sparsifier/lightning/callbacks/_data_sparstity_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/pruning/_experimental/data_sparsifier/lightning/callbacks/_data_sparstity_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/pruning/_experimental/data_sparsifier/lightning/callbacks/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/pruning/_experimental/data_sparsifier/lightning/callbacks/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/pruning/_experimental/data_sparsifier/lightning/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/pruning/_experimental/data_sparsifier/lightning/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/pruning/_experimental/data_sparsifier/data_norm_sparsifier.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/pruning/_experimental/data_sparsifier/data_norm_sparsifier.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/pruning/_experimental/data_sparsifier/benchmarks/evaluate_model_metrics.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/pruning/_experimental/data_sparsifier/benchmarks/evaluate_model_metrics.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/pruning/_experimental/data_sparsifier/benchmarks/evaluate_forward_time.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/pruning/_experimental/data_sparsifier/benchmarks/evaluate_forward_time.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/pruning/_experimental/data_sparsifier/benchmarks/evaluate_disk_savings.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/pruning/_experimental/data_sparsifier/benchmarks/evaluate_disk_savings.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/pruning/_experimental/data_sparsifier/benchmarks/dlrm_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/pruning/_experimental/data_sparsifier/benchmarks/dlrm_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/pruning/_experimental/data_sparsifier/base_data_sparsifier.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/pruning/_experimental/data_sparsifier/base_data_sparsifier.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/pruning/_experimental/data_sparsifier/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/pruning/_experimental/data_sparsifier/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/pruning/_experimental/data_scheduler/base_data_scheduler.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/pruning/_experimental/data_scheduler/base_data_scheduler.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/pruning/_experimental/data_scheduler/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/pruning/_experimental/data_scheduler/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/pruning/_experimental/activation_sparsifier/activation_sparsifier.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/pruning/_experimental/activation_sparsifier/activation_sparsifier.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/pruning/_experimental/activation_sparsifier/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/pruning/_experimental/activation_sparsifier/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/pruning/_experimental/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/pruning/_experimental/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/pruning/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/pruning/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/ns/fx/weight_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/ns/fx/weight_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/ns/fx/utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/ns/fx/utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/ns/fx/qconfig_multi_mapping.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/ns/fx/qconfig_multi_mapping.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/ns/fx/pattern_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/ns/fx/pattern_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/ns/fx/ns_types.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/ns/fx/ns_types.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/ns/fx/n_shadows_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/ns/fx/n_shadows_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/ns/fx/mappings.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/ns/fx/mappings.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/ns/fx/graph_passes.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/ns/fx/graph_passes.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/ns/fx/graph_matcher.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/ns/fx/graph_matcher.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/ns/fx/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/ns/fx/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/ns/_numeric_suite_fx.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/ns/_numeric_suite_fx.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/ns/_numeric_suite.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/ns/_numeric_suite.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/ns/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/ns/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/nn/sparse/quantized/utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/sparse/quantized/utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/nn/sparse/quantized/linear.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/sparse/quantized/linear.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/nn/sparse/quantized/dynamic/linear.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/sparse/quantized/dynamic/linear.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/nn/sparse/quantized/dynamic/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/sparse/quantized/dynamic/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/nn/sparse/quantized/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/sparse/quantized/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/nn/sparse/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/sparse/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/nn/quantized/reference/modules/utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/quantized/reference/modules/utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/nn/quantized/reference/modules/sparse.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/quantized/reference/modules/sparse.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/nn/quantized/reference/modules/rnn.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/quantized/reference/modules/rnn.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/nn/quantized/reference/modules/linear.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/quantized/reference/modules/linear.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/nn/quantized/reference/modules/conv.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/quantized/reference/modules/conv.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/nn/quantized/reference/modules/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/quantized/reference/modules/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/nn/quantized/reference/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/quantized/reference/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/nn/quantized/modules/utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/quantized/modules/utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/nn/quantized/modules/rnn.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/quantized/modules/rnn.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/nn/quantized/modules/normalization.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/quantized/modules/normalization.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/nn/quantized/modules/linear.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/quantized/modules/linear.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/nn/quantized/modules/functional_modules.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/quantized/modules/functional_modules.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/nn/quantized/modules/embedding_ops.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/quantized/modules/embedding_ops.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/nn/quantized/modules/dropout.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/quantized/modules/dropout.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/nn/quantized/modules/conv.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/quantized/modules/conv.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/ao/nn/quantized/modules/batchnorm.py
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/quantized/modules/batchnorm.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/nn/quantized/modules/activation.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/quantized/modules/activation.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/nn/quantized/modules/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/quantized/modules/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/nn/quantized/functional.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/quantized/functional.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/nn/quantized/dynamic/modules/rnn.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/quantized/dynamic/modules/rnn.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/nn/quantized/dynamic/modules/linear.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/quantized/dynamic/modules/linear.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/nn/quantized/dynamic/modules/conv.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/quantized/dynamic/modules/conv.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/nn/quantized/dynamic/modules/__init__.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/quantized/dynamic/modules/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/nn/quantized/dynamic/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/quantized/dynamic/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/nn/quantized/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/quantized/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/nn/quantizable/modules/rnn.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/quantizable/modules/rnn.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/nn/quantizable/modules/activation.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/quantizable/modules/activation.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/nn/quantizable/modules/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/quantizable/modules/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/nn/quantizable/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/quantizable/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/nn/qat/modules/linear.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/qat/modules/linear.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/nn/qat/modules/embedding_ops.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/qat/modules/embedding_ops.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/nn/qat/modules/conv.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/qat/modules/conv.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/nn/qat/modules/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/qat/modules/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/nn/qat/dynamic/modules/linear.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/qat/dynamic/modules/linear.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/nn/qat/dynamic/modules/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/qat/dynamic/modules/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/nn/qat/dynamic/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/qat/dynamic/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/nn/qat/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/qat/__init__.py + for f in `find ./torch/ -name '*.py'` + 
install -D -pm 644 ./torch/ao/nn/intrinsic/quantized/modules/linear_relu.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/intrinsic/quantized/modules/linear_relu.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/nn/intrinsic/quantized/modules/conv_relu.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/intrinsic/quantized/modules/conv_relu.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/nn/intrinsic/quantized/modules/conv_add.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/intrinsic/quantized/modules/conv_add.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/nn/intrinsic/quantized/modules/bn_relu.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/intrinsic/quantized/modules/bn_relu.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/nn/intrinsic/quantized/modules/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/intrinsic/quantized/modules/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/nn/intrinsic/quantized/dynamic/modules/linear_relu.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/intrinsic/quantized/dynamic/modules/linear_relu.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/nn/intrinsic/quantized/dynamic/modules/__init__.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/intrinsic/quantized/dynamic/modules/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/nn/intrinsic/quantized/dynamic/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/intrinsic/quantized/dynamic/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/nn/intrinsic/quantized/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/intrinsic/quantized/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/nn/intrinsic/qat/modules/linear_relu.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/intrinsic/qat/modules/linear_relu.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/nn/intrinsic/qat/modules/linear_fused.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/intrinsic/qat/modules/linear_fused.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/nn/intrinsic/qat/modules/conv_fused.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/intrinsic/qat/modules/conv_fused.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/nn/intrinsic/qat/modules/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/intrinsic/qat/modules/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/nn/intrinsic/qat/__init__.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/intrinsic/qat/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/nn/intrinsic/modules/fused.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/intrinsic/modules/fused.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/nn/intrinsic/modules/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/intrinsic/modules/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/nn/intrinsic/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/intrinsic/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/nn/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/nn/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/ao/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/ao/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/amp/grad_scaler.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/amp/grad_scaler.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/amp/autocast_mode.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/amp/autocast_mode.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/amp/__init__.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/amp/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_weights_only_unpickler.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_weights_only_unpickler.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_vmap_internals.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_vmap_internals.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_vendor/packaging/version.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_vendor/packaging/version.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_vendor/packaging/_structures.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_vendor/packaging/_structures.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_vendor/packaging/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_vendor/packaging/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_vendor/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_vendor/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_utils_internal.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_utils_internal.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_utils.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_torch_docs.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_torch_docs.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_tensor_str.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_tensor_str.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_tensor_docs.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_tensor_docs.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_tensor.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_tensor.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_subclasses/schema_check_mode.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_subclasses/schema_check_mode.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_subclasses/meta_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_subclasses/meta_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_subclasses/functional_tensor.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_subclasses/functional_tensor.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_subclasses/fake_utils.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_subclasses/fake_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_subclasses/fake_tensor.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_subclasses/fake_tensor.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_subclasses/fake_impls.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_subclasses/fake_impls.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_subclasses/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_subclasses/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_streambase.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_streambase.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_storage_docs.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_storage_docs.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_sources.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_sources.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_refs/special/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_refs/special/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_refs/nn/functional/__init__.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_refs/nn/functional/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_refs/nn/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_refs/nn/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_refs/linalg/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_refs/linalg/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_refs/fft.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_refs/fft.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_refs/_conversions.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_refs/_conversions.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_refs/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_refs/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_python_dispatcher.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_python_dispatcher.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_prims_common/wrappers.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_prims_common/wrappers.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_prims_common/__init__.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_prims_common/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_prims/rng_prims.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_prims/rng_prims.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_prims/executor.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_prims/executor.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_prims/debug_prims.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_prims/debug_prims.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_prims/context.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_prims/context.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_prims/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_prims/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_ops.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_ops.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_numpy/testing/utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_numpy/testing/utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_numpy/testing/__init__.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_numpy/testing/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_numpy/random.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_numpy/random.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_numpy/linalg.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_numpy/linalg.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_numpy/fft.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_numpy/fft.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_numpy/_util.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_numpy/_util.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_numpy/_unary_ufuncs_impl.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_numpy/_unary_ufuncs_impl.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_numpy/_ufuncs.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_numpy/_ufuncs.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_numpy/_reductions_impl.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_numpy/_reductions_impl.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_numpy/_normalizations.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_numpy/_normalizations.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_numpy/_ndarray.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_numpy/_ndarray.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_numpy/_getlimits.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_numpy/_getlimits.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_numpy/_funcs_impl.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_numpy/_funcs_impl.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_numpy/_funcs.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_numpy/_funcs.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_numpy/_dtypes_impl.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_numpy/_dtypes_impl.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_numpy/_dtypes.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_numpy/_dtypes.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_numpy/_casting_dicts.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_numpy/_casting_dicts.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_numpy/_binary_ufuncs_impl.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_numpy/_binary_ufuncs_impl.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_numpy/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_numpy/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_namedtensor_internals.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_namedtensor_internals.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_meta_registrations.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_meta_registrations.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_lowrank.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_lowrank.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_logging/structured.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_logging/structured.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_logging/_registrations.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_logging/_registrations.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_logging/_internal.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_logging/_internal.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_logging/__init__.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_logging/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_lobpcg.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_lobpcg.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_linalg_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_linalg_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_library/utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_library/utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_library/simple_registry.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_library/simple_registry.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_library/fake_class_registry.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_library/fake_class_registry.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_library/custom_ops.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_library/custom_ops.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_library/autograd.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_library/autograd.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_library/abstract_impl.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_library/abstract_impl.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_library/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_library/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_lazy/ts_backend.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_lazy/ts_backend.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_lazy/tensor_factory_functions.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_lazy/tensor_factory_functions.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_lazy/metrics.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_lazy/metrics.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_lazy/ir_cache.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_lazy/ir_cache.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_lazy/extract_compiled_graph.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_lazy/extract_compiled_graph.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_lazy/device_context.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_lazy/device_context.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_lazy/debug.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_lazy/debug.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_lazy/config.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_lazy/config.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_lazy/computation.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_lazy/computation.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_lazy/closure.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_lazy/closure.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_lazy/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_lazy/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_jit_internal.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_jit_internal.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/wrapper_benchmark.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/wrapper_benchmark.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/virtualized.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/virtualized.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/utils.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/triton_heuristics.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/triton_heuristics.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/triton_helpers.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/triton_helpers.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/test_operators.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/test_operators.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/test_case.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/test_case.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/sizevars.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/sizevars.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/select_algorithm.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/select_algorithm.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/scheduler.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/scheduler.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/quantized_lowerings.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/quantized_lowerings.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/pattern_matcher.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/pattern_matcher.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/optimize_indexing.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/optimize_indexing.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/ops_handler.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/ops_handler.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/mkldnn_lowerings.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/mkldnn_lowerings.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/metrics.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/metrics.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/lowering.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/lowering.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/kernel/unpack_mixed_mm.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/kernel/unpack_mixed_mm.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 
./torch/_inductor/kernel/templated_attention.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/kernel/templated_attention.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/kernel/mm_plus_mm.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/kernel/mm_plus_mm.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/kernel/mm_common.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/kernel/mm_common.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/kernel/mm.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/kernel/mm.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/kernel/conv.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/kernel/conv.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/kernel/bmm.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/kernel/bmm.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/kernel/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/kernel/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/ir.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/ir.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 
./torch/_inductor/inductor_prims.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/inductor_prims.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/index_propagation.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/index_propagation.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/hooks.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/hooks.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/graph.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/graph.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/split_cat.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/split_cat.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/serialized_patterns/mm_pattern.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/serialized_patterns/mm_pattern.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/serialized_patterns/bmm_pattern.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/serialized_patterns/bmm_pattern.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/serialized_patterns/addmm_pattern.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/serialized_patterns/addmm_pattern.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/serialized_patterns/_sfdp_pattern_9.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/serialized_patterns/_sfdp_pattern_9.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/serialized_patterns/_sfdp_pattern_8.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/serialized_patterns/_sfdp_pattern_8.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/serialized_patterns/_sfdp_pattern_7.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/serialized_patterns/_sfdp_pattern_7.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/serialized_patterns/_sfdp_pattern_6.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/serialized_patterns/_sfdp_pattern_6.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/serialized_patterns/_sfdp_pattern_5.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/serialized_patterns/_sfdp_pattern_5.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/serialized_patterns/_sfdp_pattern_4.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/serialized_patterns/_sfdp_pattern_4.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/serialized_patterns/_sfdp_pattern_3.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/serialized_patterns/_sfdp_pattern_3.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/serialized_patterns/_sfdp_pattern_2.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/serialized_patterns/_sfdp_pattern_2.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/serialized_patterns/_sfdp_pattern_18.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/serialized_patterns/_sfdp_pattern_18.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/serialized_patterns/_sfdp_pattern_17.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/serialized_patterns/_sfdp_pattern_17.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/serialized_patterns/_sfdp_pattern_16.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/serialized_patterns/_sfdp_pattern_16.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/serialized_patterns/_sfdp_pattern_15.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/serialized_patterns/_sfdp_pattern_15.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/serialized_patterns/_sfdp_pattern_14.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/serialized_patterns/_sfdp_pattern_14.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/serialized_patterns/_sfdp_pattern_13.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/serialized_patterns/_sfdp_pattern_13.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/serialized_patterns/_sfdp_pattern_12.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/serialized_patterns/_sfdp_pattern_12.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/serialized_patterns/_sfdp_pattern_11.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/serialized_patterns/_sfdp_pattern_11.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/serialized_patterns/_sfdp_pattern_10.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/serialized_patterns/_sfdp_pattern_10.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/serialized_patterns/_sfdp_pattern_1.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/serialized_patterns/_sfdp_pattern_1.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/serialized_patterns/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/serialized_patterns/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/replace_random.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/replace_random.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/reinplace.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/reinplace.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/quantization.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/quantization.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/pre_grad.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/pre_grad.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/post_grad.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/post_grad.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/pad_mm.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/pad_mm.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/numeric_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/numeric_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/mkldnn_fusion.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/mkldnn_fusion.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/misc_patterns.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/misc_patterns.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/joint_graph.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/joint_graph.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/group_batch_fusion.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/group_batch_fusion.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/fuse_attention.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/fuse_attention.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/freezing_patterns.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/freezing_patterns.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/efficient_conv_bn_eval.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/efficient_conv_bn_eval.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/dedupe_symint_uses.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/dedupe_symint_uses.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/decompose_mem_bound_mm.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/decompose_mem_bound_mm.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/ddp_fusion.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/ddp_fusion.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/binary_folding.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/binary_folding.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/fx_passes/__init__.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/fx_passes/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/freezing.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/freezing.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/exc.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/exc.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/dependencies.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/dependencies.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/decomposition.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/decomposition.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/debug.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/debug.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/cudagraph_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/cudagraph_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/cudagraph_trees.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/cudagraph_trees.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/coordinate_descent_tuner.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/coordinate_descent_tuner.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/constant_folding.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/constant_folding.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/config.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/config.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/compile_fx.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/compile_fx.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/comms.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/comms.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/comm_analysis.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/comm_analysis.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/codegen/xpu/device_op_overrides.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/codegen/xpu/device_op_overrides.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/codegen/xpu/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/codegen/xpu/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 
./torch/_inductor/codegen/wrapper.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/codegen/wrapper.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/codegen/triton_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/codegen/triton_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/codegen/triton_split_scan.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/codegen/triton_split_scan.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/codegen/triton_foreach.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/codegen/triton_foreach.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/codegen/triton.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/codegen/triton.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/codegen/multi_kernel.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/codegen/multi_kernel.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/codegen/memory_planning.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/codegen/memory_planning.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/codegen/cuda_combined_scheduling.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/codegen/cuda_combined_scheduling.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/codegen/cuda/gemm_template.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/codegen/cuda/gemm_template.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/codegen/cuda/device_op_overrides.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/codegen/cuda/device_op_overrides.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/codegen/cuda/cutlass_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/codegen/cuda/cutlass_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/codegen/cuda/cutlass_lib_extensions/gemm_operation_extensions.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/codegen/cuda/cutlass_lib_extensions/gemm_operation_extensions.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/codegen/cuda/cutlass_lib_extensions/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/codegen/cuda/cutlass_lib_extensions/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/codegen/cuda/cutlass_epilogue_gen.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/codegen/cuda/cutlass_epilogue_gen.py + for f in `find ./torch/ -name '*.py'` + 
install -D -pm 644 ./torch/_inductor/codegen/cuda/cuda_template.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/codegen/cuda/cuda_template.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/codegen/cuda/cuda_kernel.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/codegen/cuda/cuda_kernel.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/codegen/cuda/cuda_env.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/codegen/cuda/cuda_env.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/codegen/cuda/cuda_cpp_scheduling.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/codegen/cuda/cuda_cpp_scheduling.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/codegen/cuda/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/codegen/cuda/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/codegen/cpp_wrapper_cuda.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/codegen/cpp_wrapper_cuda.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/codegen/cpp_wrapper_cpu.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/codegen/cpp_wrapper_cpu.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/codegen/cpp.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/codegen/cpp.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/codegen/common.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/codegen/common.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/codegen/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/codegen/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/codecache.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/codecache.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/bounds.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/bounds.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/autotune_process.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/autotune_process.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_inductor/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_inductor/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_higher_order_ops/wrap.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_higher_order_ops/wrap.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_higher_order_ops/while_loop.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_higher_order_ops/while_loop.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_higher_order_ops/utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_higher_order_ops/utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_higher_order_ops/triton_kernel_wrap.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_higher_order_ops/triton_kernel_wrap.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_higher_order_ops/torchbind.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_higher_order_ops/torchbind.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_higher_order_ops/templated_attention.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_higher_order_ops/templated_attention.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_higher_order_ops/strict_mode.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_higher_order_ops/strict_mode.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_higher_order_ops/out_dtype.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_higher_order_ops/out_dtype.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_higher_order_ops/map.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_higher_order_ops/map.py + for f in `find 
./torch/ -name '*.py'` + install -D -pm 644 ./torch/_higher_order_ops/effects.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_higher_order_ops/effects.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_higher_order_ops/cond.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_higher_order_ops/cond.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_higher_order_ops/auto_functionalize.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_higher_order_ops/auto_functionalize.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_higher_order_ops/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_higher_order_ops/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_guards.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_guards.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_functorch/vmap.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_functorch/vmap.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_functorch/utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_functorch/utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_functorch/top_operators_github_usage.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_functorch/top_operators_github_usage.py + for f in `find 
./torch/ -name '*.py'` + install -D -pm 644 ./torch/_functorch/pytree_hacks.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_functorch/pytree_hacks.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_functorch/python_key.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_functorch/python_key.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_functorch/pyfunctorch.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_functorch/pyfunctorch.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_functorch/partitioners.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_functorch/partitioners.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_functorch/make_functional.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_functorch/make_functional.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_functorch/fx_minifier.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_functorch/fx_minifier.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_functorch/functional_call.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_functorch/functional_call.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_functorch/eager_transforms.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_functorch/eager_transforms.py + for f in 
`find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_functorch/deprecated.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_functorch/deprecated.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_functorch/config.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_functorch/config.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_functorch/compilers.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_functorch/compilers.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_functorch/compile_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_functorch/compile_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_functorch/benchmark_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_functorch/benchmark_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_functorch/batch_norm_replacement.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_functorch/batch_norm_replacement.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_functorch/autograd_function.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_functorch/autograd_function.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_functorch/apis.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_functorch/apis.py + for f in `find 
./torch/ -name '*.py'` + install -D -pm 644 ./torch/_functorch/aot_autograd.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_functorch/aot_autograd.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_functorch/_aot_autograd/utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_functorch/_aot_autograd/utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_functorch/_aot_autograd/traced_function_transforms.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_functorch/_aot_autograd/traced_function_transforms.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_functorch/_aot_autograd/subclass_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_functorch/_aot_autograd/subclass_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_functorch/_aot_autograd/schemas.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_functorch/_aot_autograd/schemas.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_functorch/_aot_autograd/runtime_wrappers.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_functorch/_aot_autograd/runtime_wrappers.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_functorch/_aot_autograd/logging_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_functorch/_aot_autograd/logging_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 
./torch/_functorch/_aot_autograd/jit_compile_runtime_wrappers.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_functorch/_aot_autograd/jit_compile_runtime_wrappers.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_functorch/_aot_autograd/input_output_analysis.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_functorch/_aot_autograd/input_output_analysis.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_functorch/_aot_autograd/functional_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_functorch/_aot_autograd/functional_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_functorch/_aot_autograd/dispatch_and_compile_graph.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_functorch/_aot_autograd/dispatch_and_compile_graph.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_functorch/_aot_autograd/collect_metadata_analysis.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_functorch/_aot_autograd/collect_metadata_analysis.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_functorch/_aot_autograd/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_functorch/_aot_autograd/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_functorch/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_functorch/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D 
-pm 644 ./torch/_export/wrappers.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/wrappers.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/verifier.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/verifier.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/serde/upgrade.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/serde/upgrade.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/serde/union.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/serde/union.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/serde/serialize.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/serde/serialize.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/serde/schema_check.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/serde/schema_check.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/serde/schema.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/serde/schema.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/serde/__init__.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/serde/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/passes/replace_view_ops_with_view_copy_ops_pass.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/passes/replace_view_ops_with_view_copy_ops_pass.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/passes/replace_sym_size_ops_pass.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/passes/replace_sym_size_ops_pass.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/passes/replace_set_grad_with_hop_pass.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/passes/replace_set_grad_with_hop_pass.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/passes/remove_runtime_assertions.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/passes/remove_runtime_assertions.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/passes/lift_constants_pass.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/passes/lift_constants_pass.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/passes/functionalize_side_effectful_ops_pass.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/passes/functionalize_side_effectful_ops_pass.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 
./torch/_export/passes/collect_tracepoints_pass.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/passes/collect_tracepoints_pass.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/passes/add_runtime_assertions_for_constraints_pass.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/passes/add_runtime_assertions_for_constraints_pass.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/passes/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/passes/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/pass_infra/proxy_value.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/pass_infra/proxy_value.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/pass_infra/node_metadata.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/pass_infra/node_metadata.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/pass_infra/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/pass_infra/__init__.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/pass_base.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/pass_base.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/non_strict_utils.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/non_strict_utils.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/exported_program.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/exported_program.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/error.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/error.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/db/logging.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/logging.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/db/gen_example.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/gen_example.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/db/examples/user_input_mutation.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/examples/user_input_mutation.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/db/examples/type_reflection_method.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/examples/type_reflection_method.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/db/examples/torch_sym_min.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/examples/torch_sym_min.py + for f in `find ./torch/ -name '*.py'` + 
install -D -pm 644 ./torch/_export/db/examples/tensor_setattr.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/examples/tensor_setattr.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/db/examples/static_if.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/examples/static_if.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/db/examples/static_for_loop.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/examples/static_for_loop.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/db/examples/specialized_attribute.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/examples/specialized_attribute.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/db/examples/scalar_output.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/examples/scalar_output.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/db/examples/pytree_flatten.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/examples/pytree_flatten.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/db/examples/optional_input.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/examples/optional_input.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/db/examples/null_context_manager.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/examples/null_context_manager.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/db/examples/nested_function.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/examples/nested_function.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/db/examples/model_attr_mutation.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/examples/model_attr_mutation.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/db/examples/list_unpack.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/examples/list_unpack.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/db/examples/list_contains.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/examples/list_contains.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/db/examples/fn_with_kwargs.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/examples/fn_with_kwargs.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/db/examples/dynamic_shape_view.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/examples/dynamic_shape_view.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/db/examples/dynamic_shape_slicing.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/examples/dynamic_shape_slicing.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/db/examples/dynamic_shape_round.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/examples/dynamic_shape_round.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/db/examples/dynamic_shape_map.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/examples/dynamic_shape_map.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/db/examples/dynamic_shape_if_guard.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/examples/dynamic_shape_if_guard.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/db/examples/dynamic_shape_constructor.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/examples/dynamic_shape_constructor.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/db/examples/dynamic_shape_assert.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/examples/dynamic_shape_assert.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/db/examples/dictionary.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/examples/dictionary.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/db/examples/decorator.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/examples/decorator.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/db/examples/constrain_as_value_example.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/examples/constrain_as_value_example.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/db/examples/constrain_as_size_example.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/examples/constrain_as_size_example.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/db/examples/cond_predicate.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/examples/cond_predicate.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/db/examples/cond_operands.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/examples/cond_operands.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/db/examples/cond_closed_over_variable.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/examples/cond_closed_over_variable.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 ./torch/_export/db/examples/cond_branch_nonlocal_variables.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/examples/cond_branch_nonlocal_variables.py + for f in `find ./torch/ -name '*.py'` + install -D -pm 644 
./torch/_export/db/examples/cond_branch_nested_function.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/examples/cond_branch_nested_function.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_export/db/examples/cond_branch_class_method.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/examples/cond_branch_class_method.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_export/db/examples/class_method.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/examples/class_method.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_export/db/examples/autograd_function.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/examples/autograd_function.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_export/db/examples/assume_constant_result.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/examples/assume_constant_result.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_export/db/examples/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/examples/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_export/db/case.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/case.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_export/db/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/db/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_export/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_export/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/variables/user_defined.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/variables/user_defined.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/variables/torch_function.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/variables/torch_function.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/variables/torch.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/variables/torch.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/variables/tensor.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/variables/tensor.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/variables/sdpa.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/variables/sdpa.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/variables/optimizer.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/variables/optimizer.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/variables/nn_module.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/variables/nn_module.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/variables/misc.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/variables/misc.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/variables/lists.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/variables/lists.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/variables/lazy.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/variables/lazy.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/variables/iter.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/variables/iter.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/variables/higher_order_ops.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/variables/higher_order_ops.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/variables/functions.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/variables/functions.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/variables/distributed.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/variables/distributed.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/variables/dicts.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/variables/dicts.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/variables/ctx_manager.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/variables/ctx_manager.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/variables/constant.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/variables/constant.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/variables/builtin.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/variables/builtin.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/variables/builder.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/variables/builder.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/variables/base.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/variables/base.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/variables/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/variables/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/types.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/types.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/trace_rules.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/trace_rules.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/testing.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/testing.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/test_minifier_common.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/test_minifier_common.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/test_case.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/test_case.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/tensor_version_op.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/tensor_version_op.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/symbolic_convert.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/symbolic_convert.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/source.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/source.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/side_effects.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/side_effects.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/resume_execution.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/resume_execution.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/repro/after_dynamo.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/repro/after_dynamo.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/repro/after_aot.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/repro/after_aot.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/repro/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/repro/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/replay_record.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/replay_record.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/profiler.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/profiler.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/polyfill.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/polyfill.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/output_graph.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/output_graph.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/mutation_guard.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/mutation_guard.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/logging.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/logging.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/hooks.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/hooks.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/guards.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/guards.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/funcname_cache.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/funcname_cache.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/external_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/external_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/exc.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/exc.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/eval_frame.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/eval_frame.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/device_interface.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/device_interface.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/decorators.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/decorators.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/debug_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/debug_utils.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/current_scope_id.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/current_scope_id.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/create_parameter_op.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/create_parameter_op.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/convert_frame.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/convert_frame.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/config.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/config.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/comptime.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/comptime.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/compiled_autograd.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/compiled_autograd.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/codegen.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/codegen.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/code_context.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/code_context.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/callback.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/callback.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/cache_size.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/cache_size.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/bytecode_transformation.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/bytecode_transformation.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/bytecode_analysis.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/bytecode_analysis.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/backends/tvm.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/backends/tvm.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/backends/torchxla.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/backends/torchxla.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/backends/tensorrt.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/backends/tensorrt.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/backends/registry.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/backends/registry.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/backends/onnxrt.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/backends/onnxrt.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/backends/inductor.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/backends/inductor.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/backends/distributed.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/backends/distributed.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/backends/debugging.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/backends/debugging.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/backends/cudagraphs.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/backends/cudagraphs.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/backends/common.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/backends/common.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/backends/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/backends/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/_trace_wrapped_higher_order_op.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/_trace_wrapped_higher_order_op.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dynamo/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dynamo/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dispatch/python.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dispatch/python.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_dispatch/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_dispatch/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_deploy.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_deploy.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_decomp/decompositions_for_rng.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_decomp/decompositions_for_rng.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_decomp/decompositions_for_jvp.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_decomp/decompositions_for_jvp.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_decomp/decompositions.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_decomp/decompositions.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_decomp/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_decomp/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_custom_ops.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_custom_ops.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_custom_op/impl.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_custom_op/impl.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_custom_op/functional.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_custom_op/functional.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_custom_op/autograd.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_custom_op/autograd.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_custom_op/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_custom_op/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_compile.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_compile.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_classes.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_classes.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_awaits/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_awaits/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_appdirs.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_appdirs.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/__init__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/__future__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/__future__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/__config__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/__config__.py
+ for f in `find ./torch/ -name '*.py'`
+ install -D -pm 644 ./torch/_VF.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torch/_VF.py
++ find ./torchgen/ -name '*.py'
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/yaml_utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/yaml_utils.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/utils.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/utils.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/static_runtime/generator.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/static_runtime/generator.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/static_runtime/gen_static_runtime_ops.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/static_runtime/gen_static_runtime_ops.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/static_runtime/config.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/static_runtime/config.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/static_runtime/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/static_runtime/__init__.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/shape_functions/gen_jit_shape_functions.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/shape_functions/gen_jit_shape_functions.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/selective_build/selector.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/selective_build/selector.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/selective_build/operator.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/selective_build/operator.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/selective_build/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/selective_build/__init__.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/operator_versions/gen_mobile_upgraders_constant.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/operator_versions/gen_mobile_upgraders_constant.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/operator_versions/gen_mobile_upgraders.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/operator_versions/gen_mobile_upgraders.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/operator_versions/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/operator_versions/__init__.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/native_function_generation.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/native_function_generation.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/model.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/model.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/local.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/local.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/gen_vmap_plumbing.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/gen_vmap_plumbing.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/gen_lazy_tensor.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/gen_lazy_tensor.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/gen_functionalization_type.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/gen_functionalization_type.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/gen_executorch.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/gen_executorch.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/gen_backend_stubs.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/gen_backend_stubs.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/gen_aoti_c_shim.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/gen_aoti_c_shim.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/gen.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/gen.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/fuse/gen_patterns.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/fuse/gen_patterns.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/executorch/parse.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/executorch/parse.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/executorch/model.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/executorch/model.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/executorch/api/unboxing.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/executorch/api/unboxing.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/executorch/api/types/types.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/executorch/api/types/types.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/executorch/api/types/signatures.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/executorch/api/types/signatures.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/executorch/api/types/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/executorch/api/types/__init__.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/executorch/api/et_cpp.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/executorch/api/et_cpp.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/executorch/api/custom_ops.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/executorch/api/custom_ops.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/executorch/api/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/executorch/api/__init__.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/executorch/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/executorch/__init__.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/dest/ufunc.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/dest/ufunc.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/dest/register_dispatch_key.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/dest/register_dispatch_key.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/dest/native_functions.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/dest/native_functions.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/dest/lazy_ts_lowering.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/dest/lazy_ts_lowering.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/dest/lazy_ir.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/dest/lazy_ir.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/dest/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/dest/__init__.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/decompositions/gen_jit_decompositions.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/decompositions/gen_jit_decompositions.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/context.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/context.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/code_template.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/code_template.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/api/unboxing.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/api/unboxing.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/api/ufunc.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/api/ufunc.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/api/types/types_base.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/api/types/types_base.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/api/types/types.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/api/types/types.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/api/types/signatures.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/api/types/signatures.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/api/types/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/api/types/__init__.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/api/translate.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/api/translate.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/api/structured.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/api/structured.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/api/python.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/api/python.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/api/native.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/api/native.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/api/meta.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/api/meta.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/api/lazy.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/api/lazy.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/api/functionalization.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/api/functionalization.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/api/dispatcher.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/api/dispatcher.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/api/cpp.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/api/cpp.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/api/autograd.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/api/autograd.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/api/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/api/__init__.py
+ for f in `find ./torchgen/ -name '*.py'`
+ install -D -pm 644 ./torchgen/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./torchgen/__init__.py
++ find ./functorch/ -name '*.py'
+ for f in `find ./functorch/ -name '*.py'`
+ install -D -pm 644 ./functorch/op_analysis/gen_data.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/op_analysis/gen_data.py
+ for f in `find ./functorch/ -name '*.py'`
+ install -D -pm 644 ./functorch/notebooks/_src/plot_per_sample_gradients.py
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/notebooks/_src/plot_per_sample_gradients.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/notebooks/_src/plot_jacobians_and_hessians.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/notebooks/_src/plot_jacobians_and_hessians.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/notebooks/_src/plot_ensembling.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/notebooks/_src/plot_ensembling.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/experimental/ops.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/experimental/ops.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/experimental/control_flow.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/experimental/control_flow.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/experimental/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/experimental/__init__.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/examples/maml_regression/evjang_transforms_module.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/examples/maml_regression/evjang_transforms_module.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/examples/maml_regression/evjang_transforms.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/examples/maml_regression/evjang_transforms.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/examples/maml_regression/evjang.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/examples/maml_regression/evjang.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/examples/maml_omniglot/support/omniglot_loaders.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/examples/maml_omniglot/support/omniglot_loaders.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/examples/maml_omniglot/maml-omniglot-transforms.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/examples/maml_omniglot/maml-omniglot-transforms.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/examples/maml_omniglot/maml-omniglot-ptonly.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/examples/maml_omniglot/maml-omniglot-ptonly.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/examples/maml_omniglot/maml-omniglot-higher.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/examples/maml_omniglot/maml-omniglot-higher.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/examples/lennard_jones/lennard_jones.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/examples/lennard_jones/lennard_jones.py + for f in `find ./functorch/ -name 
'*.py'` + install -D -pm 644 ./functorch/examples/ensembling/parallel_train.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/examples/ensembling/parallel_train.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/examples/dp_cifar10/cifar10_transforms.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/examples/dp_cifar10/cifar10_transforms.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/examples/dp_cifar10/cifar10_opacus.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/examples/dp_cifar10/cifar10_opacus.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/examples/compilation/simple_function.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/examples/compilation/simple_function.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/examples/compilation/linear_train.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/examples/compilation/linear_train.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/examples/compilation/fuse_module.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/examples/compilation/fuse_module.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/examples/compilation/eager_fusion.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/examples/compilation/eager_fusion.py + for f in `find ./functorch/ -name '*.py'` + 
install -D -pm 644 ./functorch/einops/rearrange.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/einops/rearrange.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/einops/_parsing.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/einops/_parsing.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/einops/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/einops/__init__.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/docs/source/conf.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/docs/source/conf.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/dim/wrap_type.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/dim/wrap_type.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/dim/tree_map.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/dim/tree_map.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/dim/reference.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/dim/reference.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/dim/op_properties.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/dim/op_properties.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 
./functorch/dim/magic_trace.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/dim/magic_trace.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/dim/dim.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/dim/dim.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/dim/delayed_mul_tensor.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/dim/delayed_mul_tensor.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/dim/batch_tensor.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/dim/batch_tensor.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/dim/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/dim/__init__.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/compile/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/compile/__init__.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/benchmarks/process_scorecard.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/benchmarks/process_scorecard.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/benchmarks/pointwise_scorecard.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/benchmarks/pointwise_scorecard.py + for f in `find ./functorch/ -name '*.py'` + 
install -D -pm 644 ./functorch/benchmarks/per_sample_grads.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/benchmarks/per_sample_grads.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/benchmarks/operator_authoring.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/benchmarks/operator_authoring.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/benchmarks/cse.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/benchmarks/cse.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/benchmarks/chrome_trace_parser.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/benchmarks/chrome_trace_parser.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/_src/vmap/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/_src/vmap/__init__.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/_src/make_functional/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/_src/make_functional/__init__.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/_src/eager_transforms/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/_src/eager_transforms/__init__.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/_src/aot_autograd/__init__.py 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/_src/aot_autograd/__init__.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/_src/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/_src/__init__.py + for f in `find ./functorch/ -name '*.py'` + install -D -pm 644 ./functorch/__init__.py /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/python3.11/site-packages/./functorch/__init__.py ++ /usr/local/cuda/bin/nvcc --version ++ grep release ++ awk '{print $2}' ++ cut -d, -f2 + cuver=12.3 + echo 'from typing import Optional' + echo '__all__ = ['\''__version__'\'', '\''debug'\'', '\''cuda'\'', '\''git_version'\'', '\''hip'\'']' + echo '__version__ = '\''2.4.0'\''' + echo 'debug = False' + echo 'cuda: Optional[str] = '\''12.3'\''' + echo 'git_version = '\''7efaf54dc46034189cb36b345764a5a9a5b693d4'\''' + echo 'hip: Optional[str] = None' + mv -f /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//builddir/build/BUILD/pytorch/nvfuser/nvfuser.so /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/ mv: cannot stat '/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//builddir/build/BUILD/pytorch/nvfuser/nvfuser.so': No such file or directory + true + mv -f /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//builddir/build/BUILD/pytorch/torch/lib/libnvfuser_codegen.so /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/ mv: cannot stat '/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//builddir/build/BUILD/pytorch/torch/lib/libnvfuser_codegen.so': No such file or directory + true + rm -rf 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/include/fmt + rm -rf /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/include/clog.h + rm -rf /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/include/xnnpack.h + rm -rf /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//builddir/build/BUILD/pytorch/test + rm -rf /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//builddir/build/BUILD/pytorch/nvfuser + rm -rf /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/cmake/fmt + rm -rf /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64//usr/lib64/pkgconfig/fmt.pc + find /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64 -name functorch.so -exec rm -f '{}' ';' + /usr/bin/python3 setup.py egg_info Building wheel torch-2.4.0a0+git7efaf54 running egg_info creating torch.egg-info writing torch.egg-info/PKG-INFO writing dependency_links to torch.egg-info/dependency_links.txt writing entry points to torch.egg-info/entry_points.txt writing requirements to torch.egg-info/requires.txt writing top-level names to torch.egg-info/top_level.txt writing manifest file 'torch.egg-info/SOURCES.txt' reading manifest file 'torch.egg-info/SOURCES.txt' reading manifest template 'MANIFEST.in' warning: no previously-included files matching '*.o' found anywhere in distribution warning: no previously-included files matching '*.so' found anywhere in distribution warning: no previously-included files matching '*.dylib' found anywhere in distribution warning: no previously-included files matching '*.a' found anywhere in distribution warning: no previously-included files matching '*.swp' found anywhere in distribution adding license file 'LICENSE' adding license file 'NOTICE' writing manifest file 'torch.egg-info/SOURCES.txt' + 
cp -r torch.egg-info /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib64/python3.11/site-packages/ + sed -i '/^\[/!s/[<=>].*//g' /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib64/python3.11/site-packages/torch.egg-info/requires.txt + sed -i /triton/d /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib64/python3.11/site-packages/torch.egg-info/requires.txt + set +x Stripping: /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/bin/torch_shm_manager Stripping: /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib64/libc10.so.2.4.0 Stripping: /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib64/libc10_cuda.so Stripping: /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib64/libcaffe2_nvrtc.so Stripping: /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib64/libnnapi_backend.so Stripping: /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib64/libshm.so.2.4.0 Stripping: /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib64/libtorch.so.2.4.0 Stripping: /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib64/libtorch_cpu.so.2.4.0 Stripping: /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib64/libtorch_cuda.so Stripping: /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib64/libtorch_cuda_linalg.so Stripping: /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib64/libtorch_global_deps.so.2.4.0 Stripping: /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib64/libtorch_python.so.2.4.0 Stripping: 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib64/python3.11/site-packages/functorch/_C.so Stripping: /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib64/python3.11/site-packages/torch/_C.so + /usr/lib/rpm/check-buildroot + /usr/lib/rpm/redhat/brp-ldconfig + /usr/lib/rpm/brp-compress + /usr/lib/rpm/brp-strip /usr/bin/strip + /usr/lib/rpm/brp-strip-comment-note /usr/bin/strip /usr/bin/objdump + /usr/lib/rpm/redhat/brp-strip-lto /usr/bin/strip + /usr/lib/rpm/brp-strip-static-archive /usr/bin/strip + /usr/lib/rpm/check-rpaths + /usr/lib/rpm/redhat/brp-mangle-shebangs + /usr/lib/rpm/brp-remove-la-files + env /usr/lib/rpm/redhat/brp-python-bytecompile '' 1 0 -j4 Bytecompiling .py files below /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/lib64/python3.11 using python3.11 + /usr/lib/rpm/redhat/brp-python-hardlink Processing files: pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64 Executing(%doc): /bin/sh -e /var/tmp/rpm-tmp.apHW9j + umask 022 + cd /builddir/build/BUILD + cd pytorch + DOCDIR=/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/share/doc/pytorch + export LC_ALL=C + LC_ALL=C + export DOCDIR + /usr/bin/mkdir -p /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/share/doc/pytorch + cp -pr README.md /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/share/doc/pytorch + cp -pr CONTRIBUTING.md /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/share/doc/pytorch + RPM_EC=0 ++ jobs -p + exit 0 Executing(%license): /bin/sh -e /var/tmp/rpm-tmp.XGu8vR + umask 022 + cd /builddir/build/BUILD + cd pytorch + LICENSEDIR=/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/share/licenses/pytorch + export LC_ALL=C + LC_ALL=C + export LICENSEDIR + /usr/bin/mkdir -p 
/builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/share/licenses/pytorch + cp -pr LICENSE /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64/usr/share/licenses/pytorch + RPM_EC=0 ++ jobs -p + exit 0 Provides: libc10.so.2.4()(64bit) libc10_cuda.so()(64bit) libcaffe2_nvrtc.so()(64bit) libnnapi_backend.so()(64bit) libshm.so.2.4()(64bit) libtorch.so.2.4()(64bit) libtorch_cpu.so.2.4()(64bit) libtorch_cuda.so()(64bit) libtorch_cuda_linalg.so()(64bit) libtorch_global_deps.so.2.4()(64bit) pytorch = 2.4.0-20240412.0.git7efaf54d.cu12_3.fc38 pytorch(x86-64) = 2.4.0-20240412.0.git7efaf54d.cu12_3.fc38 Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1 Requires: ld-linux-x86-64.so.2()(64bit) ld-linux-x86-64.so.2(GLIBC_2.3)(64bit) libc.so.6()(64bit) libc.so.6(GLIBC_2.11)(64bit) libc.so.6(GLIBC_2.14)(64bit) libc.so.6(GLIBC_2.16)(64bit) libc.so.6(GLIBC_2.17)(64bit) libc.so.6(GLIBC_2.2.5)(64bit) libc.so.6(GLIBC_2.28)(64bit) libc.so.6(GLIBC_2.3)(64bit) libc.so.6(GLIBC_2.3.2)(64bit) libc.so.6(GLIBC_2.3.4)(64bit) libc.so.6(GLIBC_2.32)(64bit) libc.so.6(GLIBC_2.33)(64bit) libc.so.6(GLIBC_2.34)(64bit) libc.so.6(GLIBC_2.6)(64bit) libc10.so.2.4()(64bit) libc10_cuda.so()(64bit) libcpuinfo.so.1()(64bit) libcublas.so.12()(64bit) libcublas.so.12(libcublas.so.12)(64bit) libcublasLt.so.12()(64bit) libcublasLt.so.12(libcublasLt.so.12)(64bit) libcuda.so.1()(64bit) libcudart.so.12()(64bit) libcudart.so.12(libcudart.so.12)(64bit) libcudnn.so.8()(64bit) libcudnn.so.8(libcudnn.so.8)(64bit) libcufft.so.11()(64bit) libcufft.so.11(libcufft.so.11)(64bit) libcurand.so.10()(64bit) libcusolver.so.11()(64bit) libcusolver.so.11(libcusolver.so.11)(64bit) libcusparse.so.12()(64bit) libcusparse.so.12(libcusparse.so.12)(64bit) libfbgemm.so.1()(64bit) libfoxi_loader.so.1()(64bit) libgcc_s.so.1()(64bit) libgcc_s.so.1(GCC_3.0)(64bit) libgcc_s.so.1(GCC_3.4)(64bit) 
libgflags.so.2.2()(64bit) libglog.so.0()(64bit) libgloo.so.1()(64bit) libgloo_cuda.so.1()(64bit) libgomp.so.1()(64bit) libgomp.so.1(GOMP_4.0)(64bit) libgomp.so.1(OMP_1.0)(64bit) libhiredis.so.1.0.0()(64bit) libkineto.so.1()(64bit) libleveldb.so.1()(64bit) liblmdb.so.0.0.0()(64bit) libm.so.6()(64bit) libm.so.6(GLIBC_2.2.5)(64bit) libm.so.6(GLIBC_2.23)(64bit) libm.so.6(GLIBC_2.27)(64bit) libm.so.6(GLIBC_2.29)(64bit) libm.so.6(GLIBC_2.35)(64bit) libmagma.so.1()(64bit) libnccl.so.2()(64bit) libnnpack.so.1()(64bit) libnuma.so.1()(64bit) libnuma.so.1(libnuma_1.1)(64bit) libnuma.so.1(libnuma_1.2)(64bit) libnvToolsExt.so.1()(64bit) libnvToolsExt.so.1(libnvToolsExt.so.1)(64bit) libnvrtc.so.12()(64bit) libnvrtc.so.12(libnvrtc.so.12)(64bit) libonnx.so()(64bit) libonnx_optimizer.so()(64bit) libonnx_proto.so()(64bit) libopenblaso.so.0()(64bit) libopencv_calib3d.so.409()(64bit) libopencv_core.so.409()(64bit) libopencv_cudev.so.409()(64bit) libopencv_dnn.so.409()(64bit) libopencv_features2d.so.409()(64bit) libopencv_flann.so.409()(64bit) libopencv_highgui.so.409()(64bit) libopencv_imgcodecs.so.409()(64bit) libopencv_imgproc.so.409()(64bit) libopencv_optflow.so.409()(64bit) libopencv_video.so.409()(64bit) libopencv_videoio.so.409()(64bit) libopencv_ximgproc.so.409()(64bit) libprotobuf.so.32()(64bit) libpthreadpool.so.1()(64bit) libqnnpack.so.1()(64bit) libshm.so.2.4()(64bit) libsleef.so.3()(64bit) libsnappy.so.1()(64bit) libstdc++.so.6()(64bit) libstdc++.so.6(CXXABI_1.3)(64bit) libstdc++.so.6(CXXABI_1.3.11)(64bit) libstdc++.so.6(CXXABI_1.3.13)(64bit) libstdc++.so.6(CXXABI_1.3.2)(64bit) libstdc++.so.6(CXXABI_1.3.3)(64bit) libstdc++.so.6(CXXABI_1.3.5)(64bit) libstdc++.so.6(CXXABI_1.3.7)(64bit) libstdc++.so.6(CXXABI_1.3.8)(64bit) libstdc++.so.6(CXXABI_1.3.9)(64bit) libstdc++.so.6(GLIBCXX_3.4)(64bit) libstdc++.so.6(GLIBCXX_3.4.11)(64bit) libstdc++.so.6(GLIBCXX_3.4.14)(64bit) libstdc++.so.6(GLIBCXX_3.4.15)(64bit) libstdc++.so.6(GLIBCXX_3.4.17)(64bit) 
libstdc++.so.6(GLIBCXX_3.4.18)(64bit) libstdc++.so.6(GLIBCXX_3.4.19)(64bit) libstdc++.so.6(GLIBCXX_3.4.20)(64bit) libstdc++.so.6(GLIBCXX_3.4.21)(64bit) libstdc++.so.6(GLIBCXX_3.4.22)(64bit) libstdc++.so.6(GLIBCXX_3.4.26)(64bit) libstdc++.so.6(GLIBCXX_3.4.29)(64bit) libstdc++.so.6(GLIBCXX_3.4.30)(64bit) libstdc++.so.6(GLIBCXX_3.4.32)(64bit) libstdc++.so.6(GLIBCXX_3.4.9)(64bit) libtensorpipe.so.1()(64bit) libtensorpipe_cuda.so.1()(64bit) libtorch.so.2.4()(64bit) libtorch_cpu.so.2.4()(64bit) libtorch_cuda.so()(64bit) libtorch_python.so.2.4()(64bit) libzmq.so.5()(64bit) rtld(GNU_HASH) Processing files: pytorch-devel-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64 Provides: cmake(ATen) cmake(Caffe2) cmake(Torch) = 2.4.0 cmake(aten) cmake(caffe2) cmake(torch) = 2.4.0 pytorch-devel = 2.4.0-20240412.0.git7efaf54d.cu12_3.fc38 pytorch-devel(x86-64) = 2.4.0-20240412.0.git7efaf54d.cu12_3.fc38 Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1 Requires: cmake-filesystem libc10.so.2.4()(64bit) libshm.so.2.4()(64bit) libtorch.so.2.4()(64bit) libtorch_cpu.so.2.4()(64bit) libtorch_global_deps.so.2.4()(64bit) Processing files: pytorch-python3-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64 warning: absolute symlink: /usr/lib64/python3.11/site-packages/torch/bin/torch_shm_manager -> /usr/bin/torch_shm_manager warning: absolute symlink: /usr/lib64/python3.11/site-packages/torch/include -> /usr/include warning: absolute symlink: /usr/lib64/python3.11/site-packages/torch/lib -> /usr/lib64 Provides: libtorch_python.so.2.4()(64bit) python3.11dist(torch) = 2.4.0 python3.11dist(torch) = 2.4~a0 python3dist(torch) = 2.4~a0 pytorch-python3 = 2.4.0-20240412.0.git7efaf54d.cu12_3.fc38 pytorch-python3(x86-64) = 2.4.0-20240412.0.git7efaf54d.cu12_3.fc38 Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PartialHardlinkSets) <= 4.0.4-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1 
Requires: ld-linux-x86-64.so.2()(64bit) ld-linux-x86-64.so.2(GLIBC_2.3)(64bit) libc.so.6()(64bit) libc.so.6(GLIBC_2.14)(64bit) libc.so.6(GLIBC_2.16)(64bit) libc.so.6(GLIBC_2.2.5)(64bit) libc.so.6(GLIBC_2.3.2)(64bit) libc.so.6(GLIBC_2.3.4)(64bit) libc.so.6(GLIBC_2.32)(64bit) libc.so.6(GLIBC_2.34)(64bit) libc10.so.2.4()(64bit) libc10_cuda.so()(64bit) libcudart.so.12()(64bit) libcudart.so.12(libcudart.so.12)(64bit) libcudnn.so.8()(64bit) libcudnn.so.8(libcudnn.so.8)(64bit) libgcc_s.so.1()(64bit) libgcc_s.so.1(GCC_3.0)(64bit) libgcc_s.so.1(GCC_3.4)(64bit) libglog.so.0()(64bit) libnvToolsExt.so.1()(64bit) libnvToolsExt.so.1(libnvToolsExt.so.1)(64bit) libprotobuf.so.32()(64bit) libshm.so.2.4()(64bit) libstdc++.so.6()(64bit) libstdc++.so.6(CXXABI_1.3)(64bit) libstdc++.so.6(CXXABI_1.3.11)(64bit) libstdc++.so.6(CXXABI_1.3.13)(64bit) libstdc++.so.6(CXXABI_1.3.2)(64bit) libstdc++.so.6(CXXABI_1.3.3)(64bit) libstdc++.so.6(CXXABI_1.3.5)(64bit) libstdc++.so.6(CXXABI_1.3.8)(64bit) libstdc++.so.6(CXXABI_1.3.9)(64bit) libstdc++.so.6(GLIBCXX_3.4)(64bit) libstdc++.so.6(GLIBCXX_3.4.11)(64bit) libstdc++.so.6(GLIBCXX_3.4.14)(64bit) libstdc++.so.6(GLIBCXX_3.4.15)(64bit) libstdc++.so.6(GLIBCXX_3.4.18)(64bit) libstdc++.so.6(GLIBCXX_3.4.19)(64bit) libstdc++.so.6(GLIBCXX_3.4.20)(64bit) libstdc++.so.6(GLIBCXX_3.4.21)(64bit) libstdc++.so.6(GLIBCXX_3.4.22)(64bit) libstdc++.so.6(GLIBCXX_3.4.26)(64bit) libstdc++.so.6(GLIBCXX_3.4.29)(64bit) libstdc++.so.6(GLIBCXX_3.4.30)(64bit) libstdc++.so.6(GLIBCXX_3.4.32)(64bit) libstdc++.so.6(GLIBCXX_3.4.9)(64bit) libtorch.so.2.4()(64bit) libtorch_cpu.so.2.4()(64bit) libtorch_cuda.so()(64bit) libtorch_python.so.2.4()(64bit) python(abi) = 3.11 python3.11dist(filelock) python3.11dist(fsspec) python3.11dist(jinja2) python3.11dist(networkx) python3.11dist(sympy) python3.11dist(typing-extensions) rtld(GNU_HASH) Checking for unpackaged file(s): /usr/lib/rpm/check-files /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64 Wrote: 
/builddir/build/RPMS/pytorch-devel-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64.rpm Wrote: /builddir/build/RPMS/pytorch-python3-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64.rpm Wrote: /builddir/build/RPMS/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64.rpm Executing(%clean): /bin/sh -e /var/tmp/rpm-tmp.d9loym + umask 022 + cd /builddir/build/BUILD + cd pytorch + /usr/bin/rm -rf /builddir/build/BUILDROOT/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.x86_64 + RPM_EC=0 ++ jobs -p + exit 0 Executing(rmbuild): /bin/sh -e /var/tmp/rpm-tmp.9bCHcj + umask 022 + cd /builddir/build/BUILD + rm -rf pytorch pytorch.gemspec + RPM_EC=0 ++ jobs -p + exit 0 RPM build warnings: %patchN is deprecated (2 usages found), use %patch N (or %patch -P N) absolute symlink: /usr/lib64/python3.11/site-packages/torch/bin/torch_shm_manager -> /usr/bin/torch_shm_manager absolute symlink: /usr/lib64/python3.11/site-packages/torch/include -> /usr/include absolute symlink: /usr/lib64/python3.11/site-packages/torch/lib -> /usr/lib64 Finish: rpmbuild pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.src.rpm Finish: build phase for pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.src.rpm INFO: chroot_scan: 3 files copied to /var/lib/copr-rpmbuild/results/chroot_scan INFO: /var/lib/mock/fedora-38-x86_64-1712885354.239135/root/var/log/dnf.rpm.log /var/lib/mock/fedora-38-x86_64-1712885354.239135/root/var/log/dnf.librepo.log /var/lib/mock/fedora-38-x86_64-1712885354.239135/root/var/log/dnf.log INFO: Done(/var/lib/copr-rpmbuild/results/pytorch-2.4.0-20240412.0.git7efaf54d.cu12_3.fc38.src.rpm) Config(child) 467 minutes 37 seconds INFO: Results and/or logs in: /var/lib/copr-rpmbuild/results INFO: Cleaning up build root ('cleanup_on_success=True') Start: clean chroot INFO: unmounting tmpfs. 
Finish: clean chroot Finish: run Running RPMResults tool Package info: { "packages": [ { "name": "pytorch", "epoch": null, "version": "2.4.0", "release": "20240412.0.git7efaf54d.cu12_3.fc38", "arch": "x86_64" }, { "name": "pytorch-python3", "epoch": null, "version": "2.4.0", "release": "20240412.0.git7efaf54d.cu12_3.fc38", "arch": "x86_64" }, { "name": "pytorch-devel", "epoch": null, "version": "2.4.0", "release": "20240412.0.git7efaf54d.cu12_3.fc38", "arch": "x86_64" }, { "name": "pytorch", "epoch": null, "version": "2.4.0", "release": "20240412.0.git7efaf54d.cu12_3.fc38", "arch": "src" } ] } RPMResults finished
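
The `%install` scriptlet recorded above post-processes the copied egg-info metadata with two `sed` passes: `sed -i '/^\[/!s/[<=>].*//g' … requires.txt` deletes the comparison operator and everything after it on every line that is not a `[section]` header, leaving bare package names, and `sed -i /triton/d … requires.txt` then drops the `triton` requirement, which the RPM does not package. A minimal sketch of that transformation on a hypothetical sample `requires.txt` (the package pins shown are illustrative, not from this build):

```shell
# Sample requires.txt in the shape setuptools emits (contents hypothetical).
cat > requires.txt <<'EOF'
filelock
sympy>=1.12
triton==2.3.0
[opt-einsum]
opt-einsum>=3.3
EOF

# Pass 1: on every line NOT starting a section header ("["), strip the
# version constraint: delete from the first <, =, or > to end of line.
sed -i '/^\[/!s/[<=>].*//g' requires.txt

# Pass 2: remove any line mentioning triton entirely.
sed -i /triton/d requires.txt

cat requires.txt
```

After both passes the file lists only unversioned names (`filelock`, `sympy`, `opt-einsum`) plus the untouched `[opt-einsum]` section header; RPM dependency generators then map those names to distribution packages without pip-style version ranges.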