Warning: Permanently added '54.162.7.100' (ED25519) to the list of known hosts.

You can reproduce this build on your computer by running:

sudo dnf install copr-rpmbuild
/usr/bin/copr-rpmbuild --verbose --drop-resultdir --task-url https://copr.fedorainfracloud.org/backend/get-build-task/9640922-fedora-43-x86_64 --chroot fedora-43-x86_64

Version: 1.6
PID: 8534
Logging PID: 8536
Task:
{'allow_user_ssh': False,
 'appstream': False,
 'background': False,
 'build_id': 9640922,
 'buildroot_pkgs': [],
 'chroot': 'fedora-43-x86_64',
 'enable_net': False,
 'fedora_review': False,
 'git_hash': 'd303daa4126ca907fd5db0a5f5e8d3715a737765',
 'git_repo': 'https://copr-dist-git.fedorainfracloud.org/git/fachep/ollama/ollama-ggml-cuda',
 'isolation': 'default',
 'memory_reqs': 2048,
 'package_name': 'ollama-ggml-cuda',
 'package_version': '0.12.3-1',
 'project_dirname': 'ollama',
 'project_name': 'ollama',
 'project_owner': 'fachep',
 'repo_priority': None,
 'repos': [{'baseurl': 'https://download.copr.fedorainfracloud.org/results/fachep/ollama/fedora-43-x86_64/',
            'id': 'copr_base',
            'name': 'Copr repository',
            'priority': None},
           {'baseurl': 'https://developer.download.nvidia.cn/compute/cuda/repos/fedora42/x86_64/',
            'id': 'https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64',
            'name': 'Additional repo https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64'},
           {'baseurl': 'https://developer.download.nvidia.cn/compute/cuda/repos/fedora41/x86_64/',
            'id': 'https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64',
            'name': 'Additional repo https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64'}],
 'sandbox': 'fachep/ollama--fachep',
 'source_json': {},
 'source_type': None,
 'ssh_public_keys': None,
 'storage': 0,
 'submitter': 'fachep',
 'tags': [],
 'task_id': '9640922-fedora-43-x86_64',
 'timeout': 18000,
 'uses_devel_repo': False,
 'with_opts': [],
 'without_opts': []}

Running: git clone https://copr-dist-git.fedorainfracloud.org/git/fachep/ollama/ollama-ggml-cuda /var/lib/copr-rpmbuild/workspace/workdir-ox7e730h/ollama-ggml-cuda --depth 500 --no-single-branch --recursive
cmd: ['git', 'clone', 'https://copr-dist-git.fedorainfracloud.org/git/fachep/ollama/ollama-ggml-cuda', '/var/lib/copr-rpmbuild/workspace/workdir-ox7e730h/ollama-ggml-cuda', '--depth', '500', '--no-single-branch', '--recursive']
cwd: .
rc: 0
stdout:
stderr:
Cloning into '/var/lib/copr-rpmbuild/workspace/workdir-ox7e730h/ollama-ggml-cuda'...

Running: git checkout d303daa4126ca907fd5db0a5f5e8d3715a737765 --
cmd: ['git', 'checkout', 'd303daa4126ca907fd5db0a5f5e8d3715a737765', '--']
cwd: /var/lib/copr-rpmbuild/workspace/workdir-ox7e730h/ollama-ggml-cuda
rc: 0
stdout:
stderr:
Note: switching to 'd303daa4126ca907fd5db0a5f5e8d3715a737765'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command.
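The same dist-git state can also be fetched by hand to inspect exactly what was built; a minimal sketch reusing the repository URL and commit hash from the log above (assumes only git and network access to the Copr dist-git instance):

  git clone https://copr-dist-git.fedorainfracloud.org/git/fachep/ollama/ollama-ggml-cuda --depth 500 --no-single-branch --recursive
  cd ollama-ggml-cuda
  git checkout d303daa4126ca907fd5db0a5f5e8d3715a737765 --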
Example:

  git switch -c <new-branch-name>

Or undo this operation with:

  git switch -

Turn off this advice by setting config variable advice.detachedHead to false

HEAD is now at d303daa automatic import of ollama-ggml-cuda

Running: dist-git-client sources
cmd: ['dist-git-client', 'sources']
cwd: /var/lib/copr-rpmbuild/workspace/workdir-ox7e730h/ollama-ggml-cuda
rc: 0
stdout:
stderr:
INFO: Reading stdout from command: git rev-parse --abbrev-ref HEAD
INFO: Reading stdout from command: git rev-parse HEAD
INFO: Reading sources specification file: sources
INFO: Downloading v0.12.3.tar.gz
INFO: Reading stdout from command: curl --help all
INFO: Calling: curl -H Pragma: -o v0.12.3.tar.gz --location --connect-timeout 60 --retry 3 --retry-delay 10 --remote-time --show-error --fail --retry-all-errors https://copr-dist-git.fedorainfracloud.org/repo/pkgs/fachep/ollama/ollama-ggml-cuda/v0.12.3.tar.gz/md5/f096acee5e82596e9afd4d07ed477de2/v0.12.3.tar.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 10.5M  100 10.5M    0     0   163M      0 --:--:-- --:--:-- --:--:--  161M
INFO: Reading stdout from command: md5sum v0.12.3.tar.gz
tail: /var/lib/copr-rpmbuild/main.log: file truncated

Running (timeout=18000): unbuffer mock --spec /var/lib/copr-rpmbuild/workspace/workdir-ox7e730h/ollama-ggml-cuda/ollama-ggml-cuda.spec --sources /var/lib/copr-rpmbuild/workspace/workdir-ox7e730h/ollama-ggml-cuda --resultdir /var/lib/copr-rpmbuild/results --uniqueext 1759434727.591343 -r /var/lib/copr-rpmbuild/results/configs/child.cfg
INFO: mock.py version 6.3 starting (python version = 3.13.7, NVR = mock-6.3-1.fc42), args: /usr/libexec/mock/mock --spec /var/lib/copr-rpmbuild/workspace/workdir-ox7e730h/ollama-ggml-cuda/ollama-ggml-cuda.spec --sources /var/lib/copr-rpmbuild/workspace/workdir-ox7e730h/ollama-ggml-cuda --resultdir /var/lib/copr-rpmbuild/results --uniqueext 1759434727.591343 -r /var/lib/copr-rpmbuild/results/configs/child.cfg
Start(bootstrap): init plugins
INFO: tmpfs initialized
INFO: selinux enabled
INFO: chroot_scan: initialized
INFO: compress_logs: initialized
Finish(bootstrap): init plugins
Start: init plugins
INFO: tmpfs initialized
INFO: selinux enabled
INFO: chroot_scan: initialized
INFO: compress_logs: initialized
Finish: init plugins
INFO: Signal handler active
Start: run
INFO: Start(/var/lib/copr-rpmbuild/workspace/workdir-ox7e730h/ollama-ggml-cuda/ollama-ggml-cuda.spec) Config(fedora-43-x86_64)
Start: clean chroot
Finish: clean chroot
Mock Version: 6.3
INFO: Mock Version: 6.3
Start(bootstrap): chroot init
INFO: mounting tmpfs at /var/lib/mock/fedora-43-x86_64-bootstrap-1759434727.591343/root.
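The sources URL above embeds the expected MD5 digest (f096acee5e82596e9afd4d07ed477de2), which dist-git-client re-checks with md5sum after the download. The same verification can be replayed by hand; a sketch reusing the URL and checksum from the curl call in the log:

  curl --location --retry 3 --retry-delay 10 --fail -o v0.12.3.tar.gz \
    https://copr-dist-git.fedorainfracloud.org/repo/pkgs/fachep/ollama/ollama-ggml-cuda/v0.12.3.tar.gz/md5/f096acee5e82596e9afd4d07ed477de2/v0.12.3.tar.gz
  echo 'f096acee5e82596e9afd4d07ed477de2  v0.12.3.tar.gz' | md5sum --check -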
INFO: calling preinit hooks
INFO: enabled root cache
INFO: enabled package manager cache
Start(bootstrap): cleaning package manager metadata
Finish(bootstrap): cleaning package manager metadata
INFO: Guessed host environment type: unknown
INFO: Using container image: registry.fedoraproject.org/fedora:43
INFO: Pulling image: registry.fedoraproject.org/fedora:43
INFO: Tagging container image as mock-bootstrap-2fe5f8da-78b0-4d80-9fdd-54c11f567ce9
INFO: Checking that fbd2b7ac2fe12801f103f414deb42fb99dcc02d30a594720a2ddf5f06c31fecf image matches host's architecture
INFO: Copy content of container fbd2b7ac2fe12801f103f414deb42fb99dcc02d30a594720a2ddf5f06c31fecf to /var/lib/mock/fedora-43-x86_64-bootstrap-1759434727.591343/root
INFO: mounting fbd2b7ac2fe12801f103f414deb42fb99dcc02d30a594720a2ddf5f06c31fecf with podman image mount
INFO: image fbd2b7ac2fe12801f103f414deb42fb99dcc02d30a594720a2ddf5f06c31fecf as /var/lib/containers/storage/overlay/8e4aa573aacb9609a613eaf37ee7a61670ba148dc3acd2855ea41179f18ba5af/merged
INFO: umounting image fbd2b7ac2fe12801f103f414deb42fb99dcc02d30a594720a2ddf5f06c31fecf (/var/lib/containers/storage/overlay/8e4aa573aacb9609a613eaf37ee7a61670ba148dc3acd2855ea41179f18ba5af/merged) with podman image umount
INFO: Removing image mock-bootstrap-2fe5f8da-78b0-4d80-9fdd-54c11f567ce9
INFO: Package manager dnf5 detected and used (fallback)
INFO: Not updating bootstrap chroot, bootstrap_image_ready=True
Start(bootstrap): creating root cache
Finish(bootstrap): creating root cache
Finish(bootstrap): chroot init
Start: chroot init
INFO: mounting tmpfs at /var/lib/mock/fedora-43-x86_64-1759434727.591343/root.
INFO: calling preinit hooks
INFO: enabled root cache
INFO: enabled package manager cache
Start: cleaning package manager metadata
Finish: cleaning package manager metadata
INFO: enabled HW Info plugin
INFO: Package manager dnf5 detected and used (direct choice)
INFO: Buildroot is handled by package management downloaded with a bootstrap image: rpm-6.0.0-1.fc43.x86_64 rpm-sequoia-1.9.0-2.fc43.x86_64 dnf5-5.2.17.0-2.fc43.x86_64 dnf5-plugins-5.2.17.0-2.fc43.x86_64
Start: installing minimal buildroot with dnf5
Updating and loading repositories:
 Copr repository                        100% |   5.9 KiB/s |   1.6 KiB | 00m00s
 Additional repo https_developer_downlo 100% |  69.5 KiB/s |  47.8 KiB | 00m01s
 updates                                100% |  34.2 KiB/s |  33.3 KiB | 00m01s
 Additional repo https_developer_downlo 100% | 144.9 KiB/s | 109.0 KiB | 00m01s
 fedora                                 100% |  26.7 MiB/s |  42.2 MiB | 00m02s
Repositories loaded.
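Everything from here on runs inside the mock buildroot described by the generated child.cfg. For debugging, the same kind of buildroot can be entered interactively with mock's shell mode; a sketch (assuming a local copr-rpmbuild run has left results/configs in place, or falling back to the stock Fedora 43 config shipped with mock):

  mock -r /var/lib/copr-rpmbuild/results/configs/child.cfg --shell
  mock -r fedora-43-x86_64 --shell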
Package Arch Version Repository Size
Installing group/module packages:
 bash x86_64 5.3.0-2.fc43 fedora 8.4 MiB
 bzip2 x86_64 1.0.8-21.fc43 fedora 95.3 KiB
 coreutils x86_64 9.7-5.fc43 fedora 5.4 MiB
 cpio x86_64 2.15-6.fc43 fedora 1.1 MiB
 diffutils x86_64 3.12-3.fc43 fedora 1.6 MiB
 fedora-release-common noarch 43-0.22 fedora 20.4 KiB
 findutils x86_64 1:4.10.0-6.fc43 fedora 1.8 MiB
 gawk x86_64 5.3.2-2.fc43 fedora 1.8 MiB
 glibc-minimal-langpack x86_64 2.42-4.fc43 fedora 0.0 B
 grep x86_64 3.12-2.fc43 fedora 1.0 MiB
 gzip x86_64 1.13-4.fc43 fedora 388.8 KiB
 info x86_64 7.2-6.fc43 fedora 353.9 KiB
 patch x86_64 2.8-2.fc43 fedora 222.8 KiB
 redhat-rpm-config noarch 343-11.fc43 fedora 182.9 KiB
 rpm-build x86_64 6.0.0-1.fc43 fedora 287.4 KiB
 sed x86_64 4.9-5.fc43 fedora 857.3 KiB
 shadow-utils x86_64 2:4.18.0-3.fc43 fedora 3.9 MiB
 tar x86_64 2:1.35-6.fc43 fedora 2.9 MiB
 unzip x86_64 6.0-67.fc43 fedora 386.3 KiB
 util-linux x86_64 2.41.1-16.fc43 fedora 3.5 MiB
 which x86_64 2.23-3.fc43 fedora 83.5 KiB
 xz x86_64 1:5.8.1-2.fc43 fedora 1.3 MiB
Installing dependencies:
 add-determinism x86_64 0.6.0-2.fc43 fedora 2.4 MiB
 alternatives x86_64 1.33-2.fc43 fedora 62.2 KiB
 ansible-srpm-macros noarch 1-18.1.fc43 fedora 35.7 KiB
 audit-libs x86_64 4.1.1-2.fc43 fedora 378.8 KiB
 binutils x86_64 2.45-1.fc43 fedora 26.5 MiB
 build-reproducibility-srpm-macros noarch 0.6.0-2.fc43 fedora 735.0 B
 bzip2-libs x86_64 1.0.8-21.fc43 fedora 80.6 KiB
 ca-certificates noarch 2025.2.80_v9.0.304-1.1.fc43 fedora 2.7 MiB
 coreutils-common x86_64 9.7-5.fc43 fedora 11.3 MiB
 crypto-policies noarch 20250714-5.gitcd6043a.fc43 fedora 146.9 KiB
 curl x86_64 8.15.0-2.fc43 fedora 473.6 KiB
 cyrus-sasl-lib x86_64 2.1.28-33.fc43 fedora 2.3 MiB
 debugedit x86_64 5.2-3.fc43 fedora 214.0 KiB
 dwz x86_64 0.16-2.fc43 fedora 287.1 KiB
 ed x86_64 1.22.2-1.fc43 fedora 148.1 KiB
 efi-srpm-macros noarch 6-4.fc43 fedora 40.1 KiB
 elfutils x86_64 0.193-3.fc43 fedora 2.9 MiB
 elfutils-debuginfod-client x86_64 0.193-3.fc43 fedora 83.9 KiB
 elfutils-default-yama-scope noarch 0.193-3.fc43 fedora 1.8 KiB
 elfutils-libelf x86_64 0.193-3.fc43 fedora 1.2 MiB
 elfutils-libs x86_64 0.193-3.fc43 fedora 683.4 KiB
 fedora-gpg-keys noarch 43-0.4 fedora 131.2 KiB
 fedora-release noarch 43-0.22 fedora 0.0 B
 fedora-release-identity-basic noarch 43-0.22 fedora 658.0 B
 fedora-repos noarch 43-0.4 fedora 4.9 KiB
 file x86_64 5.46-8.fc43 fedora 100.2 KiB
 file-libs x86_64 5.46-8.fc43 fedora 11.9 MiB
 filesystem x86_64 3.18-50.fc43 fedora 112.0 B
 filesystem-srpm-macros noarch 3.18-50.fc43 fedora 38.2 KiB
 fonts-srpm-macros noarch 1:2.0.5-23.fc43 fedora 55.8 KiB
 forge-srpm-macros noarch 0.4.0-3.fc43 fedora 38.9 KiB
 fpc-srpm-macros noarch 1.3-15.fc43 fedora 144.0 B
 gap-srpm-macros noarch 1-1.fc43 fedora 2.0 KiB
 gdb-minimal x86_64 16.3-6.fc43 fedora 13.3 MiB
 gdbm-libs x86_64 1:1.23-10.fc43 fedora 129.9 KiB
 ghc-srpm-macros noarch 1.9.2-3.fc43 fedora 779.0 B
 glibc x86_64 2.42-4.fc43 fedora 6.7 MiB
 glibc-common x86_64 2.42-4.fc43 fedora 1.0 MiB
 glibc-gconv-extra x86_64 2.42-4.fc43 fedora 7.2 MiB
 gmp x86_64 1:6.3.0-4.fc43 fedora 811.2 KiB
 gnat-srpm-macros noarch 6-8.fc43 fedora 1.0 KiB
 gnupg2 x86_64 2.4.8-4.fc43 fedora 6.5 MiB
 gnupg2-dirmngr x86_64 2.4.8-4.fc43 fedora 618.4 KiB
 gnupg2-gpg-agent x86_64 2.4.8-4.fc43 fedora 671.4 KiB
 gnupg2-gpgconf x86_64 2.4.8-4.fc43 fedora 250.0 KiB
 gnupg2-keyboxd x86_64 2.4.8-4.fc43 fedora 201.4 KiB
 gnupg2-verify x86_64 2.4.8-4.fc43 fedora 348.5 KiB
 gnutls x86_64 3.8.10-3.fc43 fedora 3.8 MiB
 go-srpm-macros noarch 3.8.0-1.fc43 fedora 61.9 KiB
 gpgverify noarch 2.2-3.fc43 fedora 8.7 KiB
 ima-evm-utils-libs x86_64 1.6.2-6.fc43 fedora 60.7 KiB
 jansson x86_64 2.14-3.fc43 fedora 89.1 KiB
 java-srpm-macros noarch 1-7.fc43 fedora 870.0 B
 json-c x86_64 0.18-7.fc43 fedora 82.7 KiB
 kernel-srpm-macros noarch 1.0-27.fc43 fedora 1.9 KiB
 keyutils-libs x86_64 1.6.3-6.fc43 fedora 54.3 KiB
 krb5-libs x86_64 1.21.3-7.fc43 fedora 2.3 MiB
 libacl x86_64 2.3.2-4.fc43 fedora 35.9 KiB
 libarchive x86_64 3.8.1-3.fc43 fedora 951.1 KiB
 libassuan x86_64 2.5.7-4.fc43 fedora 163.8 KiB
 libattr x86_64 2.5.2-6.fc43 fedora 24.4 KiB
 libblkid x86_64 2.41.1-16.fc43 fedora 262.4 KiB
 libbrotli x86_64 1.1.0-10.fc43 fedora 833.3 KiB
 libcap x86_64 2.76-3.fc43 fedora 209.1 KiB
 libcap-ng x86_64 0.8.5-7.fc43 fedora 68.9 KiB
 libcom_err x86_64 1.47.3-2.fc43 fedora 63.1 KiB
 libcurl x86_64 8.15.0-2.fc43 fedora 903.2 KiB
 libeconf x86_64 0.7.9-2.fc43 fedora 64.9 KiB
 libevent x86_64 2.1.12-16.fc43 fedora 883.1 KiB
 libfdisk x86_64 2.41.1-16.fc43 fedora 380.4 KiB
 libffi x86_64 3.5.1-2.fc43 fedora 83.6 KiB
 libfsverity x86_64 1.6-3.fc43 fedora 28.5 KiB
 libgcc x86_64 15.2.1-2.fc43 fedora 266.6 KiB
 libgcrypt x86_64 1.11.1-2.fc43 fedora 1.6 MiB
 libgomp x86_64 15.2.1-2.fc43 fedora 541.1 KiB
 libgpg-error x86_64 1.55-2.fc43 fedora 915.3 KiB
 libidn2 x86_64 2.3.8-2.fc43 fedora 552.5 KiB
 libksba x86_64 1.6.7-4.fc43 fedora 398.5 KiB
 liblastlog2 x86_64 2.41.1-16.fc43 fedora 33.9 KiB
 libmount x86_64 2.41.1-16.fc43 fedora 372.7 KiB
 libnghttp2 x86_64 1.66.0-2.fc43 fedora 162.2 KiB
 libpkgconf x86_64 2.3.0-3.fc43 fedora 78.1 KiB
 libpsl x86_64 0.21.5-6.fc43 fedora 76.4 KiB
 libselinux x86_64 3.9-5.fc43 fedora 193.1 KiB
 libsemanage x86_64 3.9-4.fc43 fedora 308.5 KiB
 libsepol x86_64 3.9-2.fc43 fedora 822.0 KiB
 libsmartcols x86_64 2.41.1-16.fc43 fedora 180.5 KiB
 libssh x86_64 0.11.3-1.fc43 fedora 567.1 KiB
 libssh-config noarch 0.11.3-1.fc43 fedora 277.0 B
 libstdc++ x86_64 15.2.1-2.fc43 fedora 2.8 MiB
 libtasn1 x86_64 4.20.0-2.fc43 fedora 176.3 KiB
 libtool-ltdl x86_64 2.5.4-7.fc43 fedora 70.1 KiB
 libunistring x86_64 1.1-10.fc43 fedora 1.7 MiB
 libusb1 x86_64 1.0.29-4.fc43 fedora 171.3 KiB
 libuuid x86_64 2.41.1-16.fc43 fedora 37.4 KiB
 libverto x86_64 0.3.2-11.fc43 fedora 25.4 KiB
 libxcrypt x86_64 4.4.38-8.fc43 fedora 284.5 KiB
 libxml2 x86_64 2.12.10-4.fc43 fedora 1.7 MiB
 libzstd x86_64 1.5.7-2.fc43 fedora 799.9 KiB
 lua-libs x86_64 5.4.8-2.fc43 fedora 280.8 KiB
 lua-srpm-macros noarch 1-16.fc43 fedora 1.3 KiB
 lz4-libs x86_64 1.10.0-3.fc43 fedora 161.4 KiB
 mpfr x86_64 4.2.2-2.fc43 fedora 832.8 KiB
 ncurses-base noarch 6.5-7.20250614.fc43 fedora 328.1 KiB
 ncurses-libs x86_64 6.5-7.20250614.fc43 fedora 946.3 KiB
 nettle x86_64 3.10.1-2.fc43 fedora 790.6 KiB
 npth x86_64 1.8-3.fc43 fedora 49.6 KiB
 ocaml-srpm-macros noarch 11-2.fc43 fedora 1.9 KiB
 openblas-srpm-macros noarch 2-20.fc43 fedora 112.0 B
 openldap x86_64 2.6.10-4.fc43 fedora 659.9 KiB
 openssl-libs x86_64 1:3.5.1-2.fc43 fedora 8.9 MiB
 p11-kit x86_64 0.25.8-1.fc43 fedora 2.3 MiB
 p11-kit-trust x86_64 0.25.8-1.fc43 fedora 446.5 KiB
 package-notes-srpm-macros noarch 0.5-14.fc43 fedora 1.6 KiB
 pam-libs x86_64 1.7.1-3.fc43 fedora 126.8 KiB
 pcre2 x86_64 10.46-1.fc43 fedora 697.7 KiB
 pcre2-syntax noarch 10.46-1.fc43 fedora 275.3 KiB
 perl-srpm-macros noarch 1-60.fc43 fedora 861.0 B
 pkgconf x86_64 2.3.0-3.fc43 fedora 88.5 KiB
 pkgconf-m4 noarch 2.3.0-3.fc43 fedora 14.4 KiB
 pkgconf-pkg-config x86_64 2.3.0-3.fc43 fedora 989.0 B
 popt x86_64 1.19-9.fc43 fedora 132.8 KiB
 publicsuffix-list-dafsa noarch 20250616-2.fc43 fedora 69.1 KiB
 pyproject-srpm-macros noarch 1.18.4-1.fc43 fedora 1.9 KiB
 python-srpm-macros noarch 3.14-5.fc43 fedora 51.5 KiB
 qt5-srpm-macros noarch 5.15.17-2.fc43 fedora 500.0 B
 qt6-srpm-macros noarch 6.9.2-1.fc43 fedora 464.0 B
 readline x86_64 8.3-2.fc43 fedora 511.7 KiB
 rpm x86_64 6.0.0-1.fc43 fedora 3.1 MiB
 rpm-build-libs x86_64 6.0.0-1.fc43 fedora 268.4 KiB
 rpm-libs x86_64 6.0.0-1.fc43 fedora 933.7 KiB
 rpm-sequoia x86_64 1.9.0-2.fc43 fedora 2.5 MiB
 rpm-sign-libs x86_64 6.0.0-1.fc43 fedora 39.7 KiB
 rust-srpm-macros noarch 26.4-1.fc43 fedora 4.8 KiB
 setup noarch 2.15.0-26.fc43 fedora 725.0 KiB
 sqlite-libs x86_64 3.50.2-2.fc43 fedora 1.5 MiB
 systemd-libs x86_64 258-1.fc43 fedora 2.3 MiB
 systemd-standalone-sysusers x86_64 258-1.fc43 fedora 293.5 KiB
 tpm2-tss x86_64 4.1.3-8.fc43 fedora 1.6 MiB
 tree-sitter-srpm-macros noarch 0.4.2-1.fc43 fedora 8.3 KiB
 util-linux-core x86_64 2.41.1-16.fc43 fedora 1.5 MiB
 xxhash-libs x86_64 0.8.3-3.fc43 fedora 90.2 KiB
 xz-libs x86_64 1:5.8.1-2.fc43 fedora 217.8 KiB
 zig-srpm-macros noarch 1-5.fc43 fedora 1.1 KiB
 zip x86_64 3.0-44.fc43 fedora 694.5 KiB
 zlib-ng-compat x86_64 2.2.5-2.fc43 fedora 137.6 KiB
 zstd x86_64 1.5.7-2.fc43 fedora 1.7 MiB
Installing groups:
 Buildsystem building group
Transaction Summary:
 Installing: 169 packages
Total size of inbound packages is 59 MiB. Need to download 59 MiB.
After this operation, 198 MiB extra will be used (install 198 MiB, remove 0 B).
[per-package download progress for the 169 inbound packages elided; total below]
--------------------------------------------------------------------------------
[169/169] Total 100% | 175.4 MiB/s | 59.0 MiB | 00m00s
Running transaction
Importing OpenPGP key 0x31645531:
 UserID : "Fedora (43) <fedora-43-primary@fedoraproject.org>"
 Fingerprint: C6E7F081CF80E13146676E88829B606631645531
 From : file:///usr/share/distribution-gpg-keys/fedora/RPM-GPG-KEY-fedora-43-primary
The key was successfully imported.
[  1/171] Verify package files 100% | 789.0 B/s | 169.0 B | 00m00s
[  2/171] Prepare transaction 100% | 4.2 KiB/s | 169.0 B | 00m00s
[per-package install progress for the remaining 169 steps elided; scriptlet output retained below]
>>> Running sysusers scriptlet: setup-0:2.15.0-26.fc43.noarch
>>> Finished sysusers scriptlet: setup-0:2.15.0-26.fc43.noarch
>>> Scriptlet output:
>>> Creating group 'adm' with GID 4.
>>> Creating group 'audio' with GID 63.
>>> Creating group 'cdrom' with GID 11.
>>> Creating group 'clock' with GID 103.
>>> Creating group 'dialout' with GID 18.
>>> Creating group 'disk' with GID 6.
>>> Creating group 'floppy' with GID 19.
>>> Creating group 'ftp' with GID 50.
>>> Creating group 'games' with GID 20.
>>> Creating group 'input' with GID 104.
>>> Creating group 'kmem' with GID 9.
>>> Creating group 'kvm' with GID 36.
>>> Creating group 'lock' with GID 54.
>>> Creating group 'lp' with GID 7.
>>> Creating group 'mail' with GID 12.
>>> Creating group 'man' with GID 15.
>>> Creating group 'mem' with GID 8.
>>> Creating group 'nobody' with GID 65534.
>>> Creating group 'render' with GID 105.
>>> Creating group 'root' with GID 0.
>>> Creating group 'sgx' with GID 106.
>>> Creating group 'sys' with GID 3.
>>> Creating group 'tape' with GID 33.
>>> Creating group 'tty' with GID 5.
>>> Creating group 'users' with GID 100.
>>> Creating group 'utmp' with GID 22.
>>> Creating group 'video' with GID 39.
>>> Creating group 'wheel' with GID 10.
>>> Creating user 'adm' (adm) with UID 3 and GID 4.
>>> Creating group 'bin' with GID 1.
>>> Creating user 'bin' (bin) with UID 1 and GID 1.
>>> Creating group 'daemon' with GID 2.
>>> Creating user 'daemon' (daemon) with UID 2 and GID 2.
>>> Creating user 'ftp' (FTP User) with UID 14 and GID 50.
>>> Creating user 'games' (games) with UID 12 and GID 100.
>>> Creating user 'halt' (halt) with UID 7 and GID 0.
>>> Creating user 'lp' (lp) with UID 4 and GID 7.
>>> Creating user 'mail' (mail) with UID 8 and GID 12.
>>> Creating user 'nobody' (Kernel Overflow User) with UID 65534 and GID 65534.
>>> Creating user 'operator' (operator) with UID 11 and GID 0.
>>> Creating user 'root' (Super User) with UID 0 and GID 0.
>>> Creating user 'shutdown' (shutdown) with UID 6 and GID 0.
>>> Creating user 'sync' (sync) with UID 5 and GID 0.
>>>
>>> [RPM] /etc/hosts created as /etc/hosts.rpmnew
>>> Running sysusers scriptlet: tpm2-tss-0:4.1.3-8.fc43.x86_64
>>> Finished sysusers scriptlet: tpm2-tss-0:4.1.3-8.fc43.x86_64
>>> Scriptlet output:
>>> Creating group 'tss' with GID 59.
>>> Creating user 'tss' (Account used for TPM access) with UID 59 and GID 59.
>>>
Complete!
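The ">>> Running sysusers scriptlet" blocks above come from systemd-standalone-sysusers: a package ships a sysusers.d(5) declaration and the install-time scriptlet creates the listed users and groups. A sketch of what such a declaration looks like for the 'tss' entry the tpm2-tss scriptlet printed (the file path and the '-' defaults are illustrative assumptions, not taken from this log):

  # /usr/lib/sysusers.d/tpm2-tss.conf (hypothetical path)
  # Type  Name  ID  GECOS                          Home  Shell
  g       tss   59
  u       tss   59  "Account used for TPM access"  -     -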
Finish: installing minimal buildroot with dnf5
Start: creating root cache
Finish: creating root cache
Finish: chroot init
INFO: Installed packages:
INFO: add-determinism-0.6.0-2.fc43.x86_64 alternatives-1.33-2.fc43.x86_64 ansible-srpm-macros-1-18.1.fc43.noarch audit-libs-4.1.1-2.fc43.x86_64 bash-5.3.0-2.fc43.x86_64 binutils-2.45-1.fc43.x86_64 build-reproducibility-srpm-macros-0.6.0-2.fc43.noarch bzip2-1.0.8-21.fc43.x86_64 bzip2-libs-1.0.8-21.fc43.x86_64 ca-certificates-2025.2.80_v9.0.304-1.1.fc43.noarch coreutils-9.7-5.fc43.x86_64 coreutils-common-9.7-5.fc43.x86_64 cpio-2.15-6.fc43.x86_64 crypto-policies-20250714-5.gitcd6043a.fc43.noarch curl-8.15.0-2.fc43.x86_64 cyrus-sasl-lib-2.1.28-33.fc43.x86_64 debugedit-5.2-3.fc43.x86_64 diffutils-3.12-3.fc43.x86_64 dwz-0.16-2.fc43.x86_64 ed-1.22.2-1.fc43.x86_64 efi-srpm-macros-6-4.fc43.noarch elfutils-0.193-3.fc43.x86_64 elfutils-debuginfod-client-0.193-3.fc43.x86_64 elfutils-default-yama-scope-0.193-3.fc43.noarch elfutils-libelf-0.193-3.fc43.x86_64 elfutils-libs-0.193-3.fc43.x86_64 fedora-gpg-keys-43-0.4.noarch fedora-release-43-0.22.noarch fedora-release-common-43-0.22.noarch fedora-release-identity-basic-43-0.22.noarch fedora-repos-43-0.4.noarch file-5.46-8.fc43.x86_64 file-libs-5.46-8.fc43.x86_64 filesystem-3.18-50.fc43.x86_64 filesystem-srpm-macros-3.18-50.fc43.noarch findutils-4.10.0-6.fc43.x86_64 fonts-srpm-macros-2.0.5-23.fc43.noarch forge-srpm-macros-0.4.0-3.fc43.noarch fpc-srpm-macros-1.3-15.fc43.noarch gap-srpm-macros-1-1.fc43.noarch gawk-5.3.2-2.fc43.x86_64 gdb-minimal-16.3-6.fc43.x86_64 gdbm-libs-1.23-10.fc43.x86_64 ghc-srpm-macros-1.9.2-3.fc43.noarch glibc-2.42-4.fc43.x86_64 glibc-common-2.42-4.fc43.x86_64 glibc-gconv-extra-2.42-4.fc43.x86_64 glibc-minimal-langpack-2.42-4.fc43.x86_64 gmp-6.3.0-4.fc43.x86_64 gnat-srpm-macros-6-8.fc43.noarch gnupg2-2.4.8-4.fc43.x86_64 gnupg2-dirmngr-2.4.8-4.fc43.x86_64 gnupg2-gpg-agent-2.4.8-4.fc43.x86_64 gnupg2-gpgconf-2.4.8-4.fc43.x86_64 gnupg2-keyboxd-2.4.8-4.fc43.x86_64 gnupg2-verify-2.4.8-4.fc43.x86_64 gnutls-3.8.10-3.fc43.x86_64 go-srpm-macros-3.8.0-1.fc43.noarch gpg-pubkey-c6e7f081cf80e13146676e88829b606631645531-66b6dccf gpgverify-2.2-3.fc43.noarch grep-3.12-2.fc43.x86_64 gzip-1.13-4.fc43.x86_64 ima-evm-utils-libs-1.6.2-6.fc43.x86_64 info-7.2-6.fc43.x86_64 jansson-2.14-3.fc43.x86_64 java-srpm-macros-1-7.fc43.noarch json-c-0.18-7.fc43.x86_64 kernel-srpm-macros-1.0-27.fc43.noarch keyutils-libs-1.6.3-6.fc43.x86_64 krb5-libs-1.21.3-7.fc43.x86_64 libacl-2.3.2-4.fc43.x86_64 libarchive-3.8.1-3.fc43.x86_64 libassuan-2.5.7-4.fc43.x86_64 libattr-2.5.2-6.fc43.x86_64 libblkid-2.41.1-16.fc43.x86_64 libbrotli-1.1.0-10.fc43.x86_64 libcap-2.76-3.fc43.x86_64 libcap-ng-0.8.5-7.fc43.x86_64 libcom_err-1.47.3-2.fc43.x86_64 libcurl-8.15.0-2.fc43.x86_64 libeconf-0.7.9-2.fc43.x86_64 libevent-2.1.12-16.fc43.x86_64 libfdisk-2.41.1-16.fc43.x86_64 libffi-3.5.1-2.fc43.x86_64 libfsverity-1.6-3.fc43.x86_64 libgcc-15.2.1-2.fc43.x86_64 libgcrypt-1.11.1-2.fc43.x86_64 libgomp-15.2.1-2.fc43.x86_64 libgpg-error-1.55-2.fc43.x86_64 libidn2-2.3.8-2.fc43.x86_64 libksba-1.6.7-4.fc43.x86_64 liblastlog2-2.41.1-16.fc43.x86_64 libmount-2.41.1-16.fc43.x86_64 libnghttp2-1.66.0-2.fc43.x86_64 libpkgconf-2.3.0-3.fc43.x86_64 libpsl-0.21.5-6.fc43.x86_64 libselinux-3.9-5.fc43.x86_64 libsemanage-3.9-4.fc43.x86_64 libsepol-3.9-2.fc43.x86_64 libsmartcols-2.41.1-16.fc43.x86_64 libssh-0.11.3-1.fc43.x86_64 libssh-config-0.11.3-1.fc43.noarch libstdc++-15.2.1-2.fc43.x86_64 libtasn1-4.20.0-2.fc43.x86_64 libtool-ltdl-2.5.4-7.fc43.x86_64 libunistring-1.1-10.fc43.x86_64 libusb1-1.0.29-4.fc43.x86_64 libuuid-2.41.1-16.fc43.x86_64 libverto-0.3.2-11.fc43.x86_64 libxcrypt-4.4.38-8.fc43.x86_64 libxml2-2.12.10-4.fc43.x86_64 libzstd-1.5.7-2.fc43.x86_64 lua-libs-5.4.8-2.fc43.x86_64 lua-srpm-macros-1-16.fc43.noarch lz4-libs-1.10.0-3.fc43.x86_64 mpfr-4.2.2-2.fc43.x86_64 ncurses-base-6.5-7.20250614.fc43.noarch ncurses-libs-6.5-7.20250614.fc43.x86_64 nettle-3.10.1-2.fc43.x86_64 npth-1.8-3.fc43.x86_64 ocaml-srpm-macros-11-2.fc43.noarch openblas-srpm-macros-2-20.fc43.noarch openldap-2.6.10-4.fc43.x86_64 openssl-libs-3.5.1-2.fc43.x86_64 p11-kit-0.25.8-1.fc43.x86_64 p11-kit-trust-0.25.8-1.fc43.x86_64 package-notes-srpm-macros-0.5-14.fc43.noarch pam-libs-1.7.1-3.fc43.x86_64 patch-2.8-2.fc43.x86_64 pcre2-10.46-1.fc43.x86_64 pcre2-syntax-10.46-1.fc43.noarch perl-srpm-macros-1-60.fc43.noarch pkgconf-2.3.0-3.fc43.x86_64 pkgconf-m4-2.3.0-3.fc43.noarch pkgconf-pkg-config-2.3.0-3.fc43.x86_64 popt-1.19-9.fc43.x86_64 publicsuffix-list-dafsa-20250616-2.fc43.noarch pyproject-srpm-macros-1.18.4-1.fc43.noarch python-srpm-macros-3.14-5.fc43.noarch qt5-srpm-macros-5.15.17-2.fc43.noarch qt6-srpm-macros-6.9.2-1.fc43.noarch readline-8.3-2.fc43.x86_64 redhat-rpm-config-343-11.fc43.noarch rpm-6.0.0-1.fc43.x86_64 rpm-build-6.0.0-1.fc43.x86_64 rpm-build-libs-6.0.0-1.fc43.x86_64 rpm-libs-6.0.0-1.fc43.x86_64 rpm-sequoia-1.9.0-2.fc43.x86_64 rpm-sign-libs-6.0.0-1.fc43.x86_64 rust-srpm-macros-26.4-1.fc43.noarch sed-4.9-5.fc43.x86_64 setup-2.15.0-26.fc43.noarch shadow-utils-4.18.0-3.fc43.x86_64 sqlite-libs-3.50.2-2.fc43.x86_64 systemd-libs-258-1.fc43.x86_64 systemd-standalone-sysusers-258-1.fc43.x86_64 tar-1.35-6.fc43.x86_64 tpm2-tss-4.1.3-8.fc43.x86_64 tree-sitter-srpm-macros-0.4.2-1.fc43.noarch unzip-6.0-67.fc43.x86_64 util-linux-2.41.1-16.fc43.x86_64 util-linux-core-2.41.1-16.fc43.x86_64 which-2.23-3.fc43.x86_64 xxhash-libs-0.8.3-3.fc43.x86_64 xz-5.8.1-2.fc43.x86_64 xz-libs-5.8.1-2.fc43.x86_64 zig-srpm-macros-1-5.fc43.noarch zip-3.0-44.fc43.x86_64 zlib-ng-compat-2.2.5-2.fc43.x86_64 zstd-1.5.7-2.fc43.x86_64
Start: buildsrpm
Start: rpmbuild -bs
Building target platforms: x86_64
Building for target x86_64
setting SOURCE_DATE_EPOCH=1759363200
Wrote: /builddir/build/SRPMS/ollama-ggml-cuda-0.12.3-1.fc43.src.rpm
Finish: rpmbuild -bs
INFO: chroot_scan: 1 files copied to /var/lib/copr-rpmbuild/results/chroot_scan
INFO: /var/lib/mock/fedora-43-x86_64-1759434727.591343/root/var/log/dnf5.log
INFO: chroot_scan: creating tarball /var/lib/copr-rpmbuild/results/chroot_scan.tar.gz
/bin/tar: Removing leading `/' from member names
Finish: buildsrpm
INFO: Done(/var/lib/copr-rpmbuild/workspace/workdir-ox7e730h/ollama-ggml-cuda/ollama-ggml-cuda.spec) Config(child) 0 minutes 21 seconds
INFO: Results and/or logs in: /var/lib/copr-rpmbuild/results
INFO: Cleaning up build root ('cleanup_on_success=True')
Start: clean chroot
INFO: unmounting tmpfs.
Finish: clean chroot
INFO: Start(/var/lib/copr-rpmbuild/results/ollama-ggml-cuda-0.12.3-1.fc43.src.rpm) Config(fedora-43-x86_64)
Start(bootstrap): chroot init
INFO: mounting tmpfs at /var/lib/mock/fedora-43-x86_64-bootstrap-1759434727.591343/root.
INFO: reusing tmpfs at /var/lib/mock/fedora-43-x86_64-bootstrap-1759434727.591343/root.
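The SRPM written above is the input for the second mock pass that follows. Its header and payload can be inspected directly with standard rpm query flags; a sketch against the copy in the results directory:

  rpm -qpi /var/lib/copr-rpmbuild/results/ollama-ggml-cuda-0.12.3-1.fc43.src.rpm   # name, version, summary
  rpm -qpl /var/lib/copr-rpmbuild/results/ollama-ggml-cuda-0.12.3-1.fc43.src.rpm   # payload: spec file and v0.12.3.tar.gz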
INFO: calling preinit hooks
INFO: enabled root cache
INFO: enabled package manager cache
Start(bootstrap): cleaning package manager metadata
Finish(bootstrap): cleaning package manager metadata
Finish(bootstrap): chroot init
Start: chroot init
INFO: mounting tmpfs at /var/lib/mock/fedora-43-x86_64-1759434727.591343/root.
INFO: calling preinit hooks
INFO: enabled root cache
Start: unpacking root cache
Finish: unpacking root cache
INFO: enabled package manager cache
Start: cleaning package manager metadata
Finish: cleaning package manager metadata
INFO: enabled HW Info plugin
INFO: Buildroot is handled by package management downloaded with a bootstrap image:
  rpm-6.0.0-1.fc43.x86_64
  rpm-sequoia-1.9.0-2.fc43.x86_64
  dnf5-5.2.17.0-2.fc43.x86_64
  dnf5-plugins-5.2.17.0-2.fc43.x86_64
Finish: chroot init
Start: build phase for ollama-ggml-cuda-0.12.3-1.fc43.src.rpm
Start: build setup for ollama-ggml-cuda-0.12.3-1.fc43.src.rpm
Building target platforms: x86_64
Building for target x86_64
setting SOURCE_DATE_EPOCH=1759363200
Wrote: /builddir/build/SRPMS/ollama-ggml-cuda-0.12.3-1.fc43.src.rpm
Updating and loading repositories:
 Additional repo https_developer_downlo 100% | 20.4 KiB/s | 3.9 KiB | 00m00s
 Additional repo https_developer_downlo 100% | 20.4 KiB/s | 3.9 KiB | 00m00s
 Copr repository 100% | 7.8 KiB/s | 1.5 KiB | 00m00s
 fedora 100% | 31.8 KiB/s | 10.3 KiB | 00m00s
 updates 100% | 117.4 KiB/s | 30.2 KiB | 00m00s
Repositories loaded.
Package  Arch  Version  Repository  Size
Installing:
 cmake  x86_64  3.31.6-4.fc43  fedora  34.5 MiB
 cuda-compiler-12-9  x86_64  12.9.1-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  0.0 B
 cuda-compiler-13-0  x86_64  13.0.1-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  0.0 B
 cuda-libraries-devel-12-9  x86_64  12.9.1-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  0.0 B
 cuda-libraries-devel-13-0  x86_64  13.0.1-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  0.0 B
 cuda-nvml-devel-12-9  x86_64  12.9.79-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  1.4 MiB
 cuda-nvml-devel-13-0  x86_64  13.0.87-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  1.4 MiB
 gcc-c++  x86_64  15.2.1-2.fc43  fedora  41.4 MiB
 gcc14  x86_64  14.3.1-1.fc43  fedora  117.6 MiB
 gcc14-c++  x86_64  14.3.1-1.fc43  fedora  124.1 MiB
Installing dependencies:
 annobin-docs  noarch  12.99-1.fc43  fedora  98.9 KiB
 annobin-plugin-gcc  x86_64  12.99-1.fc43  fedora  1.0 MiB
 cmake-data  noarch  3.31.6-4.fc43  fedora  8.5 MiB
 cmake-filesystem  x86_64  3.31.6-4.fc43  fedora  0.0 B
 cmake-rpm-macros  noarch  3.31.6-4.fc43  fedora  7.7 KiB
 cpp  x86_64  15.2.1-2.fc43  fedora  37.9 MiB
 cuda-cccl-12-9  x86_64  12.9.27-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  12.7 MiB
 cuda-cccl-13-0  x86_64  13.0.85-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  13.2 MiB
 cuda-crt-12-9  x86_64  12.9.86-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  928.8 KiB
 cuda-crt-13-0  x86_64  13.0.88-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  936.8 KiB
 cuda-cudart-12-9  x86_64  12.9.79-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  785.8 KiB
 cuda-cudart-13-0  x86_64  13.0.88-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  754.1 KiB
 cuda-cudart-devel-12-9  x86_64  12.9.79-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  8.5 MiB
 cuda-cudart-devel-13-0  x86_64  13.0.88-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  6.2 MiB
 cuda-culibos-devel-13-0  x86_64  13.0.85-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  96.4 KiB
 cuda-cuobjdump-12-9  x86_64  12.9.82-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  665.7 KiB
 cuda-cuobjdump-13-0  x86_64  13.0.85-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  750.4 KiB
 cuda-cuxxfilt-12-9  x86_64  12.9.82-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  1.0 MiB
 cuda-cuxxfilt-13-0  x86_64  13.0.85-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  1.0 MiB
 cuda-driver-devel-12-9  x86_64  12.9.79-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  131.0 KiB
 cuda-driver-devel-13-0  x86_64  13.0.88-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  135.3 KiB
 cuda-nvcc-12-9  x86_64  12.9.86-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  317.8 MiB
 cuda-nvcc-13-0  x86_64  13.0.88-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  111.0 MiB
 cuda-nvprune-12-9  x86_64  12.9.82-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  181.0 KiB
 cuda-nvprune-13-0  x86_64  13.0.85-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  181.3 KiB
 cuda-nvrtc-12-9  x86_64  12.9.86-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  216.9 MiB
 cuda-nvrtc-13-0  x86_64  13.0.88-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  217.4 MiB
 cuda-nvrtc-devel-12-9  x86_64  12.9.86-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  248.0 MiB
 cuda-nvrtc-devel-13-0  x86_64  13.0.88-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  244.5 MiB
 cuda-nvvm-12-9  x86_64  12.9.86-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  132.6 MiB
 cuda-opencl-12-9  x86_64  12.9.19-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  91.7 KiB
 cuda-opencl-13-0  x86_64  13.0.85-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  96.5 KiB
 cuda-opencl-devel-12-9  x86_64  12.9.19-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  741.1 KiB
 cuda-opencl-devel-13-0  x86_64  13.0.85-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  747.9 KiB
 cuda-profiler-api-12-9  x86_64  12.9.79-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  73.4 KiB
 cuda-profiler-api-13-0  x86_64  13.0.85-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  77.6 KiB
 cuda-sandbox-devel-12-9  x86_64  12.9.19-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  146.3 KiB
 cuda-sandbox-devel-13-0  x86_64  13.0.85-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  149.4 KiB
 cuda-toolkit-12-9-config-common  noarch  12.9.79-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  0.0 B
 cuda-toolkit-12-config-common  noarch  12.9.79-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  44.0 B
 cuda-toolkit-13-0-config-common  noarch  13.0.88-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  0.0 B
 cuda-toolkit-13-config-common  noarch  13.0.88-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  44.0 B
 cuda-toolkit-config-common  noarch  13.0.88-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  41.0 B
 emacs-filesystem  noarch  1:30.0-5.fc43  fedora  0.0 B
 expat  x86_64  2.7.2-1.fc43  fedora  298.6 KiB
 gcc  x86_64  15.2.1-2.fc43  fedora  111.9 MiB
 gcc-plugin-annobin  x86_64  15.2.1-2.fc43  fedora  57.2 KiB
 glibc-devel  x86_64  2.42-4.fc43  fedora  2.3 MiB
 jsoncpp  x86_64  1.9.6-2.fc43  fedora  257.6 KiB
 kernel-headers  x86_64  6.17.0-63.fc43  fedora  6.7 MiB
 libcublas-12-9  x86_64  12.9.1.4-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  815.6 MiB
 libcublas-13-0  x86_64  13.0.2.14-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  567.2 MiB
 libcublas-devel-12-9  x86_64  12.9.1.4-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  1.2 GiB
 libcublas-devel-13-0  x86_64  13.0.2.14-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  961.6 MiB
 libcufft-12-9  x86_64  11.4.1.4-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  277.2 MiB
 libcufft-13-0  x86_64  12.0.0.61-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  274.3 MiB
 libcufft-devel-12-9  x86_64  11.4.1.4-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  567.3 MiB
 libcufft-devel-13-0  x86_64  12.0.0.61-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  280.5 MiB
 libcufile-12-9  x86_64  1.14.1.1-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  3.2 MiB
 libcufile-13-0  x86_64  1.15.1.6-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  3.2 MiB
 libcufile-devel-12-9  x86_64  1.14.1.1-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  27.9 MiB
 libcufile-devel-13-0  x86_64  1.15.1.6-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  27.9 MiB
 libcurand-12-9  x86_64  10.3.10.19-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  159.3 MiB
 libcurand-13-0  x86_64  10.4.0.35-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  126.6 MiB
 libcurand-devel-12-9  x86_64  10.3.10.19-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  161.3 MiB
 libcurand-devel-13-0  x86_64  10.4.0.35-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  129.0 MiB
 libcusolver-12-9  x86_64  11.7.5.82-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  470.6 MiB
 libcusolver-13-0  x86_64  12.0.4.66-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  233.8 MiB
 libcusolver-devel-12-9  x86_64  11.7.5.82-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  332.5 MiB
 libcusolver-devel-13-0  x86_64  12.0.4.66-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  180.9 MiB
 libcusparse-12-9  x86_64  12.5.10.65-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  463.0 MiB
 libcusparse-13-0  x86_64  12.6.3.3-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  155.1 MiB
 libcusparse-devel-12-9  x86_64  12.5.10.65-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  960.3 MiB
 libcusparse-devel-13-0  x86_64  12.6.3.3-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  348.7 MiB
 libmpc  x86_64  1.3.1-8.fc43  fedora  160.6 KiB
 libnpp-12-9  x86_64  12.4.1.87-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  393.0 MiB
 libnpp-13-0  x86_64  13.0.1.2-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  157.3 MiB
 libnpp-devel-12-9  x86_64  12.4.1.87-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  406.2 MiB
 libnpp-devel-13-0  x86_64  13.0.1.2-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  184.5 MiB
 libnvfatbin-12-9  x86_64  12.9.82-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  2.4 MiB
 libnvfatbin-13-0  x86_64  13.0.85-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  2.4 MiB
 libnvfatbin-devel-12-9  x86_64  12.9.82-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  2.3 MiB
 libnvfatbin-devel-13-0  x86_64  13.0.85-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  2.3 MiB
 libnvjitlink-12-9  x86_64  12.9.86-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  91.6 MiB
 libnvjitlink-13-0  x86_64  13.0.88-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  94.3 MiB
 libnvjitlink-devel-12-9  x86_64  12.9.86-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  127.6 MiB
 libnvjitlink-devel-13-0  x86_64  13.0.88-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  130.0 MiB
 libnvjpeg-12-9  x86_64  12.4.0.76-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  9.0 MiB
 libnvjpeg-13-0  x86_64  13.0.1.86-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  5.7 MiB
 libnvjpeg-devel-12-9  x86_64  12.4.0.76-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64  9.4 MiB
 libnvjpeg-devel-13-0  x86_64  13.0.1.86-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  6.4 MiB
 libnvptxcompiler-13-0  x86_64  13.0.88-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  85.4 MiB
 libnvvm-13-0  x86_64  13.0.88-1  https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64  133.6 MiB
 libstdc++-devel  x86_64  15.2.1-2.fc43  fedora  37.3 MiB
 libuv  x86_64  1:1.51.0-2.fc43  fedora  570.2 KiB
 libxcrypt-devel  x86_64  4.4.38-8.fc43  fedora  30.8 KiB
 make  x86_64  1:4.4.1-11.fc43  fedora  1.8 MiB
 mpdecimal  x86_64  4.0.1-2.fc43  fedora  217.2 KiB
 python-pip-wheel  noarch  25.1.1-16.fc43  fedora  1.2 MiB
 python3  x86_64  3.14.0~rc3-1.fc43  fedora  28.9 KiB
 python3-libs  x86_64  3.14.0~rc3-1.fc43  fedora  43.0 MiB
 rhash  x86_64  1.4.5-3.fc43  fedora  351.1 KiB
 tzdata  noarch  2025b-3.fc43  fedora  1.6 MiB
 vim-filesystem  noarch  2:9.1.1775-1.fc43  fedora  40.0 B
Transaction Summary:
 Installing: 114 packages
Total size of inbound packages is 7 GiB. Need to download 7 GiB.
After this operation, 12 GiB extra will be used (install 12 GiB, remove 0 B).
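Note that this single transaction pulls in two complete, parallel CUDA stacks — 12.9 from the fedora41 repo and 13.0 from the fedora42 repo — plus gcc14 alongside the system gcc 15, presumably because nvcc 12.x does not accept GCC 15 as a host compiler. A quick, hedged way to confirm the resulting layout from outside the build (illustrative sketch; it assumes the mock config path printed at the top of this log):

    # Spot-check the buildroot: both toolkits should appear under /usr/local,
    # with both nvcc packages and both host compilers installed.
    mock -r /var/lib/copr-rpmbuild/results/configs/child.cfg \
         --chroot 'ls /usr/local; rpm -q cuda-nvcc-12-9 cuda-nvcc-13-0 gcc gcc14'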
[  1/114] cmake-0:3.31.6-4.fc43.x86_64 100% | 210.9 MiB/s | 12.2 MiB | 00m00s
[  2/114] gcc14-c++-0:14.3.1-1.fc43.x86 100% | 137.0 MiB/s | 25.9 MiB | 00m00s
[  3/114] cuda-compiler-12-9-0:12.9.1-1 100% | 37.8 KiB/s | 7.4 KiB | 00m00s
[  4/114] gcc14-0:14.3.1-1.fc43.x86_64 100% | 149.2 MiB/s | 43.9 MiB | 00m00s
[  5/114] cuda-compiler-13-0-0:13.0.1-1 100% | 71.7 KiB/s | 7.5 KiB | 00m00s
[  6/114] cuda-libraries-devel-12-9-0:1 100% | 116.2 KiB/s | 7.9 KiB | 00m00s
[  7/114] cuda-libraries-devel-13-0-0:1 100% | 122.1 KiB/s | 7.9 KiB | 00m00s
[  8/114] gcc-c++-0:15.2.1-2.fc43.x86_6 100% | 200.7 MiB/s | 15.3 MiB | 00m00s
[  9/114] libmpc-0:1.3.1-8.fc43.x86_64 100% | 22.9 MiB/s | 70.4 KiB | 00m00s
[ 10/114] make-1:4.4.1-11.fc43.x86_64 100% | 114.3 MiB/s | 585.2 KiB | 00m00s
[ 11/114] cmake-data-0:3.31.6-4.fc43.no 100% | 117.5 MiB/s | 2.5 MiB | 00m00s
[ 12/114] cmake-filesystem-0:3.31.6-4.f 100% | 5.0 MiB/s | 15.5 KiB | 00m00s
[ 13/114] expat-0:2.7.2-1.fc43.x86_64 100% | 23.2 MiB/s | 118.9 KiB | 00m00s
[ 14/114] jsoncpp-0:1.9.6-2.fc43.x86_64 100% | 16.4 MiB/s | 101.1 KiB | 00m00s
[ 15/114] libuv-1:1.51.0-2.fc43.x86_64 100% | 52.0 MiB/s | 266.1 KiB | 00m00s
[ 16/114] rhash-0:1.4.5-3.fc43.x86_64 100% | 17.6 MiB/s | 197.9 KiB | 00m00s
[ 17/114] cuda-nvml-devel-12-9-0:12.9.7 100% | 953.4 KiB/s | 201.2 KiB | 00m00s
[ 18/114] cuda-nvml-devel-13-0-0:13.0.8 100% | 915.8 KiB/s | 218.9 KiB | 00m00s
[ 19/114] cuda-cuobjdump-12-9-0:12.9.82 100% | 1.8 MiB/s | 277.9 KiB | 00m00s
[ 20/114] cuda-cuxxfilt-12-9-0:12.9.82- 100% | 1.4 MiB/s | 282.8 KiB | 00m00s
[ 21/114] cuda-nvprune-12-9-0:12.9.82-1 100% | 296.9 KiB/s | 76.0 KiB | 00m00s
[ 22/114] cuda-crt-13-0-0:13.0.88-1.x86 100% | 260.6 KiB/s | 120.9 KiB | 00m00s
[ 23/114] cuda-cuobjdump-13-0-0:13.0.85 100% | 753.0 KiB/s | 309.5 KiB | 00m00s
[ 24/114] cuda-cuxxfilt-13-0-0:13.0.85- 100% | 1.5 MiB/s | 283.6 KiB | 00m00s
[ 25/114] cuda-nvprune-13-0-0:13.0.85-1 100% | 503.8 KiB/s | 76.6 KiB | 00m00s
[ 26/114] cuda-nvcc-13-0-0:13.0.88-1.x8 100% | 57.7 MiB/s | 35.3 MiB | 00m01s
[ 27/114] libnvptxcompiler-13-0-0:13.0. 100% | 44.9 MiB/s | 21.3 MiB | 00m00s
[ 28/114] cuda-cccl-12-9-0:12.9.27-1.x8 100% | 10.4 MiB/s | 1.7 MiB | 00m00s
[ 29/114] cuda-nvcc-12-9-0:12.9.86-1.x8 100% | 59.3 MiB/s | 111.3 MiB | 00m02s
[ 30/114] cuda-driver-devel-12-9-0:12.9 100% | 683.7 KiB/s | 43.1 KiB | 00m00s
[ 31/114] cuda-cudart-devel-12-9-0:12.9 100% | 7.5 MiB/s | 3.0 MiB | 00m00s
[ 32/114] libnvvm-13-0-0:13.0.88-1.x86_ 100% | 77.3 MiB/s | 58.3 MiB | 00m01s
[ 33/114] cuda-opencl-devel-12-9-0:12.9 100% | 823.7 KiB/s | 119.4 KiB | 00m00s
[ 34/114] cuda-profiler-api-12-9-0:12.9 100% | 391.5 KiB/s | 26.2 KiB | 00m00s
[ 35/114] cuda-sandbox-devel-12-9-0:12. 100% | 631.9 KiB/s | 44.2 KiB | 00m00s
[ 36/114] cuda-nvrtc-devel-12-9-0:12.9. 100% | 60.7 MiB/s | 74.2 MiB | 00m01s
[ 37/114] libcufile-devel-12-9-0:1.14.1 100% | 25.4 MiB/s | 5.2 MiB | 00m00s
[ 38/114] libcurand-devel-12-9-0:10.3.1 100% | 50.1 MiB/s | 64.2 MiB | 00m01s
[ 39/114] libcusolver-devel-12-9-0:11.7 100% | 56.7 MiB/s | 213.1 MiB | 00m04s
[ 40/114] libcufft-devel-12-9-0:11.4.1. 100% | 57.7 MiB/s | 385.6 MiB | 00m07s
[ 41/114] libcublas-devel-12-9-0:12.9.1 100% | 59.5 MiB/s | 630.3 MiB | 00m11s
[ 42/114] libnvfatbin-devel-12-9-0:12.9 100% | 2.8 MiB/s | 863.8 KiB | 00m00s
[ 43/114] libnvjitlink-devel-12-9-0:12. 100% | 54.4 MiB/s | 36.1 MiB | 00m01s
[ 44/114] libnpp-devel-12-9-0:12.4.1.87 100% | 52.4 MiB/s | 268.0 MiB | 00m05s
[ 45/114] libnvjpeg-devel-12-9-0:12.4.0 100% | 12.9 MiB/s | 4.9 MiB | 00m00s
[ 46/114] cuda-cccl-13-0-0:13.0.85-1.x8 100% | 10.6 MiB/s | 1.7 MiB | 00m00s
[ 47/114] cuda-culibos-devel-13-0-0:13. 100% | 478.0 KiB/s | 32.5 KiB | 00m00s
[ 48/114] cuda-cudart-devel-13-0-0:13.0 100% | 11.8 MiB/s | 1.9 MiB | 00m00s
[ 49/114] cuda-driver-devel-13-0-0:13.0 100% | 475.8 KiB/s | 44.3 KiB | 00m00s
[ 50/114] cuda-opencl-devel-13-0-0:13.0 100% | 875.2 KiB/s | 120.8 KiB | 00m00s
[ 51/114] cuda-profiler-api-13-0-0:13.0 100% | 360.8 KiB/s | 27.1 KiB | 00m00s
[ 52/114] cuda-sandbox-devel-13-0-0:13. 100% | 666.7 KiB/s | 45.3 KiB | 00m00s
[ 53/114] cuda-nvrtc-devel-13-0-0:13.0. 100% | 50.7 MiB/s | 73.7 MiB | 00m01s
[ 54/114] libcufft-devel-13-0-0:12.0.0. 100% | 57.4 MiB/s | 205.4 MiB | 00m04s
[ 55/114] libcufile-devel-13-0-0:1.15.1 100% | 14.8 MiB/s | 5.2 MiB | 00m00s
[ 56/114] libcusparse-devel-12-9-0:12.5 100% | 58.8 MiB/s | 710.9 MiB | 00m12s
[ 57/114] libcurand-devel-13-0-0:10.4.0 100% | 34.5 MiB/s | 56.0 MiB | 00m02s
[ 58/114] libcusolver-devel-13-0-0:12.0 100% | 55.0 MiB/s | 124.4 MiB | 00m02s
[ 59/114] libcublas-devel-13-0-0:13.0.2 100% | 54.3 MiB/s | 470.7 MiB | 00m09s
[ 60/114] libnvfatbin-devel-13-0-0:13.0 100% | 4.6 MiB/s | 877.4 KiB | 00m00s
[ 61/114] libnvjitlink-devel-13-0-0:13. 100% | 46.8 MiB/s | 36.7 MiB | 00m01s
[ 62/114] libnvjpeg-devel-13-0-0:13.0.1 100% | 15.8 MiB/s | 3.4 MiB | 00m00s
[ 63/114] gcc-0:15.2.1-2.fc43.x86_64 100% | 252.9 MiB/s | 39.7 MiB | 00m00s
[ 64/114] emacs-filesystem-1:30.0-5.fc4 100% | 2.4 MiB/s | 7.5 KiB | 00m00s
[ 65/114] vim-filesystem-2:9.1.1775-1.f 100% | 3.8 MiB/s | 15.4 KiB | 00m00s
[ 66/114] cuda-crt-12-9-0:12.9.86-1.x86 100% | 814.1 KiB/s | 119.7 KiB | 00m00s
[ 67/114] libnpp-devel-13-0-0:13.0.1.2- 100% | 58.7 MiB/s | 125.6 MiB | 00m02s
[ 68/114] cuda-cudart-12-9-0:12.9.79-1. 100% | 1.7 MiB/s | 236.8 KiB | 00m00s
[ 69/114] libcusparse-devel-13-0-0:12.6 100% | 61.0 MiB/s | 286.7 MiB | 00m05s
[ 70/114] cuda-nvvm-12-9-0:12.9.86-1.x8 100% | 45.7 MiB/s | 57.6 MiB | 00m01s
[ 71/114] cuda-opencl-12-9-0:12.9.19-1. 100% | 503.6 KiB/s | 34.2 KiB | 00m00s
[ 72/114] cuda-nvrtc-12-9-0:12.9.86-1.x 100% | 52.5 MiB/s | 84.8 MiB | 00m02s
[ 73/114] libcufile-12-9-0:1.14.1.1-1.x 100% | 7.5 MiB/s | 1.2 MiB | 00m00s
[ 74/114] libcurand-12-9-0:10.3.10.19-1 100% | 41.9 MiB/s | 63.9 MiB | 00m02s
[ 75/114] libcufft-12-9-0:11.4.1.4-1.x8 100% | 53.0 MiB/s | 191.7 MiB | 00m04s
[ 76/114] libcusolver-12-9-0:11.7.5.82- 100% | 59.2 MiB/s | 324.9 MiB | 00m05s
[ 77/114] libcublas-12-9-0:12.9.1.4-1.x 100% | 57.1 MiB/s | 555.4 MiB | 00m10s
[ 78/114] libcusparse-12-9-0:12.5.10.65 100% | 54.8 MiB/s | 351.7 MiB | 00m06s
[ 79/114] libnvfatbin-12-9-0:12.9.82-1. 100% | 886.9 KiB/s | 940.1 KiB | 00m01s
[ 80/114] libnvjpeg-12-9-0:12.4.0.76-1. 100% | 3.3 MiB/s | 5.1 MiB | 00m02s
[ 81/114] cuda-cudart-13-0-0:13.0.88-1. 100% | 929.7 KiB/s | 223.1 KiB | 00m00s
[ 82/114] libnvjitlink-12-9-0:12.9.86-1 100% | 14.1 MiB/s | 37.6 MiB | 00m03s
[ 83/114] cuda-opencl-13-0-0:13.0.85-1. 100% | 504.0 KiB/s | 35.3 KiB | 00m00s
[ 84/114] cuda-nvrtc-13-0-0:13.0.88-1.x 100% | 62.5 MiB/s | 85.4 MiB | 00m01s
[ 85/114] libnpp-12-9-0:12.4.1.87-1.x86 100% | 38.6 MiB/s | 271.1 MiB | 00m07s
[ 86/114] libcufile-13-0-0:1.15.1.6-1.x 100% | 692.1 KiB/s | 1.2 MiB | 00m02s
[ 87/114] libcurand-13-0-0:10.4.0.35-1. 100% | 41.2 MiB/s | 55.7 MiB | 00m01s
[ 88/114] libcufft-13-0-0:12.0.0.61-1.x 100% | 40.5 MiB/s | 204.4 MiB | 00m05s
[ 89/114] libcublas-13-0-0:13.0.2.14-1. 100% | 44.2 MiB/s | 401.1 MiB | 00m09s
[ 90/114] libcusolver-13-0-0:12.0.4.66- 100% | 39.9 MiB/s | 191.4 MiB | 00m05s
[ 91/114] libcusparse-13-0-0:12.6.3.3-1 100% | 36.5 MiB/s | 139.2 MiB | 00m04s
[ 92/114] libnvfatbin-13-0-0:13.0.85-1. 100% | 4.3 MiB/s | 950.0 KiB | 00m00s
[ 93/114] libnvjpeg-13-0-0:13.0.1.86-1. 100% | 16.3 MiB/s | 3.5 MiB | 00m00s
[ 94/114] cpp-0:15.2.1-2.fc43.x86_64 100% | 263.8 MiB/s | 12.9 MiB | 00m00s
[ 95/114] libstdc++-devel-0:15.2.1-2.fc 100% | 120.3 MiB/s | 5.3 MiB | 00m00s
[ 96/114] glibc-devel-0:2.42-4.fc43.x86 100% | 110.5 MiB/s | 565.9 KiB | 00m00s
[ 97/114] libxcrypt-devel-0:4.4.38-8.fc 100% | 9.5 MiB/s | 29.2 KiB | 00m00s
[ 98/114] cuda-toolkit-config-common-0: 100% | 102.0 KiB/s | 8.0 KiB | 00m00s
[ 99/114] cuda-toolkit-13-0-config-comm 100% | 103.6 KiB/s | 7.8 KiB | 00m00s
[100/114] libnvjitlink-13-0-0:13.0.88-1 100% | 60.9 MiB/s | 38.5 MiB | 00m01s
[101/114] cuda-toolkit-13-config-common 100% | 110.7 KiB/s | 8.0 KiB | 00m00s
[102/114] cuda-toolkit-12-9-config-comm 100% | 114.2 KiB/s | 7.8 KiB | 00m00s
[103/114] kernel-headers-0:6.17.0-63.fc 100% | 188.6 MiB/s | 1.7 MiB | 00m00s
[104/114] annobin-plugin-gcc-0:12.99-1. 100% | 138.9 MiB/s | 996.0 KiB | 00m00s
[105/114] gcc-plugin-annobin-0:15.2.1-2 100% | 18.6 MiB/s | 57.1 KiB | 00m00s
[106/114] annobin-docs-0:12.99-1.fc43.n 100% | 17.5 MiB/s | 89.5 KiB | 00m00s
[107/114] cmake-rpm-macros-0:3.31.6-4.f 100% | 7.2 MiB/s | 14.8 KiB | 00m00s
[108/114] python3-0:3.14.0~rc3-1.fc43.x 100% | 13.5 MiB/s | 27.6 KiB | 00m00s
[109/114] python3-libs-0:3.14.0~rc3-1.f 100% | 233.9 MiB/s | 9.8 MiB | 00m00s
[110/114] mpdecimal-0:4.0.1-2.fc43.x86_ 100% | 23.7 MiB/s | 97.1 KiB | 00m00s
[111/114] python-pip-wheel-0:25.1.1-16. 100% | 200.8 MiB/s | 1.2 MiB | 00m00s
[112/114] tzdata-0:2025b-3.fc43.noarch 100% | 77.5 MiB/s | 713.9 KiB | 00m00s
[113/114] cuda-toolkit-12-config-common 100% | 45.8 KiB/s | 8.0 KiB | 00m00s
[114/114] libnpp-13-0-0:13.0.1.2-1.x86_ 100% | 63.6 MiB/s | 127.8 MiB | 00m02s
--------------------------------------------------------------------------------
[114/114] Total 100% | 146.3 MiB/s | 7.2 GiB | 00m50s
Running transaction
[  1/116] Verify package files 100% | 2.0 B/s | 114.0 B | 00m52s
[  2/116] Prepare transaction 100% | 1.7 KiB/s | 114.0 B | 00m00s
[  3/116] Installing cuda-toolkit-confi 100% | 304.7 KiB/s | 312.0 B | 00m00s
[  4/116] Installing cuda-toolkit-12-co 100% | 0.0 B/s | 316.0 B | 00m00s
[  5/116] Installing cuda-toolkit-12-9- 100% | 0.0 B/s | 124.0 B | 00m00s
[  6/116] Installing cuda-toolkit-13-co 100% | 0.0 B/s | 316.0 B | 00m00s
[  7/116] Installing cuda-toolkit-13-0- 100% | 0.0 B/s | 124.0 B | 00m00s
[  8/116] Installing cuda-culibos-devel 100% | 0.0 B/s | 97.0 KiB | 00m00s
[  9/116] Installing libmpc-0:1.3.1-8.f 100% | 79.1 MiB/s | 162.1 KiB | 00m00s
[ 10/116] Installing make-1:4.4.1-11.fc 100% | 72.0 MiB/s | 1.8 MiB | 00m00s
[ 11/116] Installing libstdc++-devel-0: 100% | 451.6 MiB/s | 37.5 MiB | 00m00s
[ 12/116] Installing cuda-cccl-13-0-0:1 100% | 212.3 MiB/s | 13.6 MiB | 00m00s
[ 13/116] Installing cuda-cccl-12-9-0:1 100% | 111.6 MiB/s | 13.1 MiB | 00m00s
[ 14/116] Installing libnvvm-13-0-0:13. 100% | 65.1 MiB/s | 133.6 MiB | 00m02s
[ 15/116] Installing libnvptxcompiler-1 100% | 76.0 MiB/s | 85.4 MiB | 00m01s
[ 16/116] Installing cuda-crt-13-0-0:13 100% | 153.3 MiB/s | 942.2 KiB | 00m00s
[ 17/116] Installing expat-0:2.7.2-1.fc 100% | 19.6 MiB/s | 300.7 KiB | 00m00s
[ 18/116] Installing cmake-filesystem-0 100% | 7.4 MiB/s | 7.6 KiB | 00m00s
[ 19/116] Installing cpp-0:15.2.1-2.fc4 100% | 341.9 MiB/s | 38.0 MiB | 00m00s
[ 20/116] Installing cuda-sandbox-devel 100% | 148.2 MiB/s | 151.7 KiB | 00m00s
[ 21/116] Installing cuda-cudart-13-0-0 100% | 82.0 MiB/s | 755.6 KiB | 00m00s
[ 22/116] Installing cuda-cudart-devel- 100% | 298.2 MiB/s | 6.3 MiB | 00m00s
[ 23/116] Installing cuda-opencl-13-0-0 100% | 19.2 MiB/s | 98.1 KiB | 00m00s
[ 24/116] Installing cuda-opencl-devel- 100% | 244.6 MiB/s | 751.3 KiB | 00m00s
[ 25/116] Installing libcublas-13-0-0:1 100% | 132.9 MiB/s | 567.2 MiB | 00m04s
[ 26/116] Installing libcublas-devel-13 100% | 60.1 MiB/s | 961.6 MiB | 00m16s
[ 27/116] Installing libcufft-13-0-0:12 100% | 187.6 MiB/s | 274.3 MiB | 00m01s
[ 28/116] Installing libcufft-devel-13- 100% | 42.5 MiB/s | 280.5 MiB | 00m07s
[ 29/116] Installing libcufile-13-0-0:1 100% | 169.0 MiB/s | 3.2 MiB | 00m00s
[ 30/116] Installing libcufile-devel-13 100% | 172.3 MiB/s | 27.9 MiB | 00m00s
[ 31/116] Installing libcurand-13-0-0:1 100% | 368.1 MiB/s | 126.6 MiB | 00m00s
[ 32/116] Installing libcurand-devel-13 100% | 61.5 MiB/s | 129.0 MiB | 00m02s
[ 33/116] Installing libcusolver-13-0-0 100% | 190.6 MiB/s | 233.8 MiB | 00m01s
[ 34/116] Installing libcusolver-devel- 100% | 42.7 MiB/s | 180.9 MiB | 00m04s
[ 35/116] Installing libcusparse-13-0-0 100% | 281.5 MiB/s | 155.1 MiB | 00m01s
[ 36/116] Installing libcusparse-devel- 100% | 41.3 MiB/s | 348.7 MiB | 00m08s
[ 37/116] Installing libnpp-13-0-0:13.0 100% | 315.3 MiB/s | 157.4 MiB | 00m00s
[ 38/116] Installing libnpp-devel-13-0- 100% | 52.4 MiB/s | 184.5 MiB | 00m04s
[ 39/116] Installing libnvfatbin-13-0-0 100% | 161.3 MiB/s | 2.4 MiB | 00m00s
[ 40/116] Installing libnvfatbin-devel- 100% | 101.8 MiB/s | 2.3 MiB | 00m00s
[ 41/116] Installing libnvjitlink-13-0- 100% | 250.1 MiB/s | 94.3 MiB | 00m00s
[ 42/116] Installing libnvjitlink-devel 100% | 84.0 MiB/s | 130.0 MiB | 00m02s
[ 43/116] Installing libnvjpeg-13-0-0:1 100% | 217.9 MiB/s | 5.7 MiB | 00m00s
[ 44/116] Installing libnvjpeg-devel-13 100% | 22.5 MiB/s | 6.4 MiB | 00m00s
[ 45/116] Installing cuda-sandbox-devel 100% | 72.6 MiB/s | 148.6 KiB | 00m00s
[ 46/116] Installing cuda-cudart-12-9-0 100% | 59.1 MiB/s | 787.3 KiB | 00m00s
[ 47/116] Installing cuda-cudart-devel- 100% | 94.2 MiB/s | 8.5 MiB | 00m00s
[ 48/116] Installing cuda-opencl-12-9-0 100% | 15.2 MiB/s | 93.4 KiB | 00m00s
[ 49/116] Installing cuda-opencl-devel- 100% | 181.8 MiB/s | 744.4 KiB | 00m00s
[ 50/116] Installing libcublas-12-9-0:1 100% | 48.2 MiB/s | 815.6 MiB | 00m17s
[ 51/116] Installing libcublas-devel-12 100% | 58.2 MiB/s | 1.2 GiB | 00m21s
[ 52/116] Installing libcufft-12-9-0:11 100% | 45.0 MiB/s | 277.2 MiB | 00m06s
[ 53/116] Installing libcufft-devel-12- 100% | 47.9 MiB/s | 567.3 MiB | 00m12s
[ 54/116] Installing libcufile-12-9-0:1 100% | 15.8 MiB/s | 3.2 MiB | 00m00s
[ 55/116] Installing libcufile-devel-12 100% | 120.3 MiB/s | 27.9 MiB | 00m00s
[ 56/116] Installing libcurand-12-9-0:1 100% | 83.7 MiB/s | 159.3 MiB | 00m02s
[ 57/116] Installing libcurand-devel-12 100% | 79.4 MiB/s | 161.3 MiB | 00m02s
[ 58/116] Installing libcusolver-12-9-0 100% | 43.9 MiB/s | 470.6 MiB | 00m11s
[ 59/116] Installing libcusolver-devel- 100% | 49.4 MiB/s | 332.5 MiB | 00m07s
[ 60/116] Installing libcusparse-12-9-0 100% | 42.4 MiB/s | 463.0 MiB | 00m11s
[ 61/116] Installing libcusparse-devel- 100% | 42.9 MiB/s | 960.3 MiB | 00m22s
[ 62/116] Installing libnpp-12-9-0:12.4 100% | 45.3 MiB/s | 393.0 MiB | 00m09s
[ 63/116] Installing libnpp-devel-12-9- 100% | 46.0 MiB/s | 406.2 MiB | 00m09s
[ 64/116] Installing libnvfatbin-12-9-0 100% | 70.5 MiB/s | 2.4 MiB | 00m00s
[ 65/116] Installing libnvfatbin-devel- 100% | 96.2 MiB/s | 2.3 MiB | 00m00s
[ 66/116] Installing libnvjitlink-12-9- 100% | 76.7 MiB/s | 91.6 MiB | 00m01s
[ 67/116] Installing libnvjitlink-devel 100% | 102.6 MiB/s | 127.6 MiB | 00m01s
[ 68/116] Installing libnvjpeg-12-9-0:1 100% | 59.5 MiB/s | 9.0 MiB | 00m00s
[ 69/116] Installing libnvjpeg-devel-12 100% | 44.7 MiB/s | 9.4 MiB | 00m00s
[ 70/116] Installing tzdata-0:2025b-3.f 100% | 24.9 MiB/s | 1.9 MiB | 00m00s
[ 71/116] Installing python-pip-wheel-0 100% | 177.9 MiB/s | 1.2 MiB | 00m00s
[ 72/116] Installing mpdecimal-0:4.0.1- 100% | 16.4 MiB/s | 218.8 KiB | 00m00s
[ 73/116] Installing python3-libs-0:3.1 100% | 68.6 MiB/s | 43.3 MiB | 00m01s
[ 74/116] Installing python3-0:3.14.0~r 100% | 309.8 KiB/s | 30.7 KiB | 00m00s
[ 75/116] Installing cmake-rpm-macros-0 100% | 2.7 MiB/s | 8.3 KiB | 00m00s
[ 76/116] Installing annobin-docs-0:12. 100% | 12.2 MiB/s | 100.1 KiB | 00m00s
[ 77/116] Installing kernel-headers-0:6 100% | 191.0 MiB/s | 6.9 MiB | 00m00s
[ 78/116] Installing glibc-devel-0:2.42 100% | 147.1 MiB/s | 2.4 MiB | 00m00s
[ 79/116] Installing libxcrypt-devel-0: 100% | 3.2 MiB/s | 33.1 KiB | 00m00s
[ 80/116] Installing gcc-0:15.2.1-2.fc4 100% | 82.2 MiB/s | 111.9 MiB | 00m01s
[ 81/116] Installing gcc-c++-0:15.2.1-2 100% | 78.8 MiB/s | 41.4 MiB | 00m01s
[ 82/116] Installing cuda-nvcc-13-0-0:1 100% | 69.0 MiB/s | 111.0 MiB | 00m02s
[ 83/116] Installing gcc14-0:14.3.1-1.f 100% | 71.5 MiB/s | 117.7 MiB | 00m02s
[ 84/116] Installing cuda-nvrtc-13-0-0: 100% | 66.2 MiB/s | 217.4 MiB | 00m03s
[ 85/116] Installing cuda-nvrtc-devel-1 100% | 79.6 MiB/s | 244.5 MiB | 00m03s
[ 86/116] Installing cuda-nvrtc-12-9-0: 100% | 71.1 MiB/s | 216.9 MiB | 00m03s
[ 87/116] Installing cuda-nvrtc-devel-1 100% | 86.8 MiB/s | 248.0 MiB | 00m03s
[ 88/116] Installing cuda-nvvm-12-9-0:1 100% | 60.4 MiB/s | 132.7 MiB | 00m02s
[ 89/116] Installing cuda-crt-12-9-0:12 100% | 114.0 MiB/s | 933.9 KiB | 00m00s
[ 90/116] Installing cuda-nvcc-12-9-0:1 100% | 89.0 MiB/s | 317.8 MiB | 00m04s
[ 91/116] Installing vim-filesystem-2:9 100% | 1.2 MiB/s | 4.7 KiB | 00m00s
[ 92/116] Installing emacs-filesystem-1 100% | 265.6 KiB/s | 544.0 B | 00m00s
[ 93/116] Installing cuda-profiler-api- 100% | 38.6 MiB/s | 79.1 KiB | 00m00s
[ 94/116] Installing cuda-driver-devel- 100% | 33.5 MiB/s | 137.0 KiB | 00m00s
[ 95/116] Installing cuda-profiler-api- 100% | 24.4 MiB/s | 74.9 KiB | 00m00s
[ 96/116] Installing cuda-driver-devel- 100% | 43.2 MiB/s | 132.8 KiB | 00m00s
[ 97/116] Installing cuda-nvprune-13-0- 100% | 88.9 MiB/s | 182.1 KiB | 00m00s
[ 98/116] Installing cuda-cuxxfilt-13-0 100% | 104.9 MiB/s | 1.0 MiB | 00m00s
[ 99/116] Installing cuda-cuobjdump-13- 100% | 81.5 MiB/s | 751.3 KiB | 00m00s
[100/116] Installing cuda-nvprune-12-9- 100% | 59.2 MiB/s | 181.8 KiB | 00m00s
[101/116] Installing cuda-cuxxfilt-12-9 100% | 95.0 MiB/s | 1.0 MiB | 00m00s
[102/116] Installing cuda-cuobjdump-12- 100% | 46.5 MiB/s | 666.6 KiB | 00m00s
[103/116] Installing rhash-0:1.4.5-3.fc 100% | 14.5 MiB/s | 356.4 KiB | 00m00s
[104/116] Installing libuv-1:1.51.0-2.f 100% | 46.6 MiB/s | 573.0 KiB | 00m00s
[105/116] Installing jsoncpp-0:1.9.6-2. 100% | 42.2 MiB/s | 259.2 KiB | 00m00s
[106/116] Installing cmake-0:3.31.6-4.f 100% | 75.3 MiB/s | 34.5 MiB | 00m00s
[107/116] Installing cmake-data-0:3.31. 100% | 71.4 MiB/s | 9.1 MiB | 00m00s
[108/116] Installing cuda-compiler-12-9 100% | 0.0 B/s | 124.0 B | 00m00s
[109/116] Installing cuda-compiler-13-0 100% | 0.0 B/s | 124.0 B | 00m00s
[110/116] Installing cuda-libraries-dev 100% | 0.0 B/s | 124.0 B | 00m00s
[111/116] Installing cuda-libraries-dev 100% | 20.2 KiB/s | 124.0 B | 00m00s
[112/116] Installing gcc14-c++-0:14.3.1 100% | 107.4 MiB/s | 124.2 MiB | 00m01s
[113/116] Installing annobin-plugin-gcc 100% | 75.9 MiB/s | 1.0 MiB | 00m00s
[114/116] Installing gcc-plugin-annobin 100% | 3.8 MiB/s | 58.6 KiB | 00m00s
[115/116] Installing cuda-nvml-devel-13 100% | 142.1 MiB/s | 1.4 MiB | 00m00s
[116/116] Installing cuda-nvml-devel-12 100% | 2.1 MiB/s | 1.4 MiB | 00m01s
Warning: skipped OpenPGP checks for 85 packages from repositories: https_developer_download_nvidia_cn_compute_cuda_repos_fedora41_x86_64, https_developer_download_nvidia_cn_compute_cuda_repos_fedora42_x86_64
Complete!
Finish: build setup for ollama-ggml-cuda-0.12.3-1.fc43.src.rpm
Start: rpmbuild ollama-ggml-cuda-0.12.3-1.fc43.src.rpm
Building target platforms: x86_64
Building for target x86_64
setting SOURCE_DATE_EPOCH=1759363200
Executing(%mkbuilddir): /bin/sh -e /var/tmp/rpm-tmp.H4cHMg
Executing(%prep): /bin/sh -e /var/tmp/rpm-tmp.qhB4TX
+ umask 022
+ cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ rm -rf ollama-0.12.3
+ /usr/lib/rpm/rpmuncompress -x /builddir/build/SOURCES/v0.12.3.tar.gz
+ STATUS=0
+ '[' 0 -ne 0 ']'
+ cd ollama-0.12.3
+ /usr/bin/chmod -Rf a+rX,u+w,g-w,o-w .
+ /usr/bin/patch -p1 -s --fuzz=0 --no-backup-if-mismatch -f
+ /usr/lib/rpm/rpmuncompress /builddir/build/SOURCES/remove-runtime-for-cuda-and-rocm.patch
+ /usr/bin/patch -p1 -s --fuzz=0 --no-backup-if-mismatch -f
+ /usr/lib/rpm/rpmuncompress /builddir/build/SOURCES/replace-library-paths.patch
+ cp -a /usr/local/cuda-12/ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/
+ patch -p1 -d /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/targets/x86_64-linux/
patching file include/crt/math_functions.h
Hunk #1 succeeded at 2553 with fuzz 1.
Hunk #2 succeeded at 2576 with fuzz 1.
Hunk #3 succeeded at 2598 with fuzz 1.
patch unexpectedly ends in middle of line
Hunk #4 succeeded at 2620 with fuzz 1.
patching file include/crt/math_functions.h
Hunk #1 succeeded at 594 (offset -32 lines).
Hunk #2 succeeded at 622 (offset -32 lines).
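The "patch unexpectedly ends in middle of line" message above is harmless here: patch(1) prints it when the last line of a patch file has no trailing newline, and all four hunks still applied (with fuzz 1). A minimal check for that condition, with a hypothetical file name since the log does not name the patch being piped in:

    # Command substitution strips a trailing newline, so $(tail -c 1 ...) is
    # non-empty exactly when the file's final byte is not a newline.
    p=math_functions.patch    # hypothetical name for the CUDA header patch
    [ -n "$(tail -c 1 "$p")" ] && echo "no trailing newline: $p"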
+ patch -p1 -d /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/targets/x86_64-linux/
+ cp -a /usr/local/cuda-13/ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/
+ patch -p1 -d /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/targets/x86_64-linux/
patching file include/crt/math_functions.h
+ RPM_EC=0
++ jobs -p
+ exit 0
Executing(%build): /bin/sh -e /var/tmp/rpm-tmp.1Olv8r
+ umask 022
+ cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ CFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer '
+ export CFLAGS
+ CXXFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer '
+ export CXXFLAGS
+ FFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules '
+ export FFLAGS
+ FCFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules '
+ export FCFLAGS
+ VALAFLAGS=-g
+ export VALAFLAGS
+ RUSTFLAGS='-Copt-level=3 -Cdebuginfo=2 -Ccodegen-units=1 -Cstrip=none -Cforce-frame-pointers=yes -Clink-arg=-specs=/usr/lib/rpm/redhat/redhat-package-notes --cap-lints=warn'
+ export RUSTFLAGS
+ LDFLAGS='-Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-hardened-ld-errors -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes '
+ export LDFLAGS
+ LT_SYS_LIBRARY_PATH=/usr/lib64:
+ export LT_SYS_LIBRARY_PATH
+ CC=gcc
+ export CC
+ CXX=g++
+ export CXX
+ cd ollama-0.12.3
+ CFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer '
+ export CFLAGS
+ CXXFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer '
+ export CXXFLAGS
+ FFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules '
+ export FFLAGS
+ FCFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules '
+ export FCFLAGS
+ VALAFLAGS=-g
+ export VALAFLAGS
+ RUSTFLAGS='-Copt-level=3 -Cdebuginfo=2 -Ccodegen-units=1 -Cstrip=none -Cforce-frame-pointers=yes -Clink-arg=-specs=/usr/lib/rpm/redhat/redhat-package-notes --cap-lints=warn'
+ export RUSTFLAGS
+ LDFLAGS='-Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-hardened-ld-errors -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes '
+ export LDFLAGS
+ LT_SYS_LIBRARY_PATH=/usr/lib64:
+ export LT_SYS_LIBRARY_PATH
+ CC=gcc
+ export CC
+ CXX=g++
+ export CXX
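The block of exports above appears twice with identical values; that is expected rather than a bug — the spec evidently expands Fedora's %set_build_flags once at the top of %build and again after changing into ollama-0.12.3, and the macro is idempotent. The values themselves are not hand-written in the spec; they come from redhat-rpm-config macros and can be reproduced on any Fedora 43 host (a hedged sketch, not part of the build):

    # Print the distro-wide compiler and linker flags that %set_build_flags
    # exports as CFLAGS/CXXFLAGS/LDFLAGS in the trace above.
    rpm --eval '%{build_cflags}'
    rpm --eval '%{build_ldflags}'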
+ /usr/bin/cmake -S . -B redhat-linux-build_cuda-13 -DCMAKE_C_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_CXX_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_Fortran_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_VERBOSE_MAKEFILE:BOOL=ON -DCMAKE_INSTALL_DO_STRIP:BOOL=OFF -DCMAKE_INSTALL_PREFIX:PATH=/usr -DCMAKE_INSTALL_FULL_SBINDIR:PATH=/usr/bin -DCMAKE_INSTALL_SBINDIR:PATH=bin -DINCLUDE_INSTALL_DIR:PATH=/usr/include -DLIB_INSTALL_DIR:PATH=/usr/lib64 -DSYSCONF_INSTALL_DIR:PATH=/etc -DSHARE_INSTALL_PREFIX:PATH=/usr/share -DLIB_SUFFIX=64 -DBUILD_SHARED_LIBS:BOOL=ON --preset 'CUDA 13' -DOLLAMA_RUNNER_DIR=cuda_v13 -DCMAKE_CUDA_COMPILER=/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -DCMAKE_CUDA_FLAGS_RELEASE=-DNDEBUG '-DCMAKE_CUDA_FLAGS=-O2 -g -Xcompiler "-fPIC"'
Preset CMake variables:
  CMAKE_BUILD_TYPE="Release"
  CMAKE_CUDA_ARCHITECTURES="75-virtual;80-virtual;86-virtual;87-virtual;89-virtual;90-virtual;90a-virtual;100-virtual;110-virtual;120-virtual;121-virtual"
  CMAKE_MSVC_RUNTIME_LIBRARY="MultiThreaded"
-- The C compiler identification is GNU 15.2.1
-- The CXX compiler identification is GNU 15.2.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Warning: ccache not found - consider installing it for faster compilation or disable this warning with GGML_CCACHE=OFF
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- GGML_SYSTEM_ARCH: x86
-- Including CPU backend
-- x86 detected
-- Adding CPU backend variant ggml-cpu-x64:
-- x86 detected
-- Adding CPU backend variant ggml-cpu-sse42: -msse4.2 GGML_SSE42
-- x86 detected
-- Adding CPU backend variant ggml-cpu-sandybridge: -msse4.2;-mavx GGML_SSE42;GGML_AVX
-- x86 detected
-- Adding CPU backend variant ggml-cpu-haswell: -msse4.2;-mf16c;-mfma;-mbmi2;-mavx;-mavx2 GGML_SSE42;GGML_F16C;GGML_FMA;GGML_BMI2;GGML_AVX;GGML_AVX2
-- x86 detected
-- Adding CPU backend variant ggml-cpu-skylakex: -msse4.2;-mf16c;-mfma;-mbmi2;-mavx;-mavx2;-mavx512f;-mavx512cd;-mavx512vl;-mavx512dq;-mavx512bw GGML_SSE42;GGML_F16C;GGML_FMA;GGML_BMI2;GGML_AVX;GGML_AVX2;GGML_AVX512
-- x86 detected
-- Adding CPU backend variant ggml-cpu-icelake: -msse4.2;-mf16c;-mfma;-mbmi2;-mavx;-mavx2;-mavx512f;-mavx512cd;-mavx512vl;-mavx512dq;-mavx512bw;-mavx512vbmi;-mavx512vnni GGML_SSE42;GGML_F16C;GGML_FMA;GGML_BMI2;GGML_AVX;GGML_AVX2;GGML_AVX512;GGML_AVX512_VBMI;GGML_AVX512_VNNI
-- x86 detected
-- Adding CPU backend variant ggml-cpu-alderlake: -msse4.2;-mf16c;-mfma;-mbmi2;-mavx;-mavx2;-mavxvnni GGML_SSE42;GGML_F16C;GGML_FMA;GGML_BMI2;GGML_AVX;GGML_AVX2;GGML_AVX_VNNI
-- Found CUDAToolkit: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/targets/x86_64-linux/include (found version "13.0.88")
-- CUDA Toolkit found
-- Using CUDA architectures: 75-virtual;80-virtual;86-virtual;87-virtual;89-virtual;90-virtual;90a-virtual;100-virtual;110-virtual;120-virtual;121-virtual
-- The CUDA compiler identification is NVIDIA 13.0.88 with host compiler GNU 15.2.1
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- Looking for a HIP compiler
-- Looking for a HIP compiler - NOTFOUND
-- Configuring done (8.2s)
-- Generating done (0.0s)
CMake Warning:
  Manually-specified variables were not used by the project:
    CMAKE_Fortran_FLAGS_RELEASE
    CMAKE_INSTALL_DO_STRIP
    INCLUDE_INSTALL_DIR
    LIB_SUFFIX
    SHARE_INSTALL_PREFIX
    SYSCONF_INSTALL_DIR
-- Build files have been written to: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13
+ /usr/bin/cmake --build redhat-linux-build_cuda-13 -j4 --verbose --target ggml-cuda
Change Dir: '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13'
Run Build Command(s): /usr/bin/cmake -E env VERBOSE=1 /usr/bin/gmake -f Makefile -j4 ggml-cuda
/usr/bin/cmake -S/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3 -B/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13 --check-build-system CMakeFiles/Makefile.cmake 0
/usr/bin/gmake -f CMakeFiles/Makefile2 ggml-cuda
gmake[1]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13'
/usr/bin/cmake -S/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3 -B/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13 --check-build-system CMakeFiles/Makefile.cmake 0
/usr/bin/cmake -E cmake_progress_start /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/CMakeFiles 47
/usr/bin/gmake -f CMakeFiles/Makefile2 ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/all
gmake[2]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13'
/usr/bin/gmake -f ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/build.make ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/depend
gmake[3]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13'
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13 && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3 /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13 /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/DependInfo.cmake "--color="
gmake[3]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13'
/usr/bin/gmake -f ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/build.make ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/build
gmake[3]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13'
[ 2%] Building C object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml.c.o
[ 4%] Building C object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-alloc.c.o
[ 4%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml.cpp.o
[ 4%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-backend.cpp.o
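Every entry in CMAKE_CUDA_ARCHITECTURES above carries the -virtual suffix, which tells nvcc to embed only PTX (compute_NN) and no precompiled SASS; the driver JIT-compiles the PTX for whatever GPU is present at load time, trading first-launch latency for a smaller, forward-compatible binary. One hedged way to verify that on the finished library (cuobjdump comes from the cuda-cuobjdump packages installed earlier; the .so path is illustrative):

    # A PTX-only fat binary lists one entry per compute_NN target here...
    cuobjdump --list-ptx cuda_v13/libggml-cuda.so
    # ...and nothing here, since no sm_NN machine code was embedded.
    cuobjdump --list-elf cuda_v13/libggml-cuda.so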
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src && /usr/bin/gcc -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml.c.o -MF CMakeFiles/ggml-base.dir/ggml.c.o.d -o CMakeFiles/ggml-base.dir/ggml.c.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.c
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src && /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml.cpp.o -MF CMakeFiles/ggml-base.dir/ggml.cpp.o.d -o CMakeFiles/ggml-base.dir/ggml.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.cpp
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src && /usr/bin/gcc -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-alloc.c.o -MF CMakeFiles/ggml-base.dir/ggml-alloc.c.o.d -o CMakeFiles/ggml-base.dir/ggml-alloc.c.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-alloc.c
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src && /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-backend.cpp.o -MF CMakeFiles/ggml-base.dir/ggml-backend.cpp.o.d -o CMakeFiles/ggml-base.dir/ggml-backend.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-backend.cpp
In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-alloc.c:4:
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘ggml_hash_insert’ defined but not used [-Wunused-function]
  261 | static size_t ggml_hash_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |               ^~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘ggml_hash_contains’ defined but not used [-Wunused-function]
  256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |             ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘ggml_bitset_size’ defined but not used [-Wunused-function]
  187 | static size_t ggml_bitset_size(size_t n) {
      |               ^~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘ggml_set_op_params_f32’ defined but not used [-Wunused-function]
  150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) {
      |             ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘ggml_set_op_params_i32’ defined but not used [-Wunused-function]
  145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) {
      |             ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘ggml_get_op_params_f32’ defined but not used [-Wunused-function]
  140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) {
      |              ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘ggml_get_op_params_i32’ defined but not used [-Wunused-function]
  135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) {
      |                ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘ggml_set_op_params’ defined but not used [-Wunused-function]
  129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) {
      |             ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.c:5663:13: warning: ‘ggml_hash_map_free’ defined but not used [-Wunused-function]
 5663 | static void ggml_hash_map_free(struct hash_map * map) {
      |             ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.c:5656:26: warning: ‘ggml_new_hash_map’ defined but not used [-Wunused-function]
 5656 | static struct hash_map * ggml_new_hash_map(size_t size) {
      |                          ^~~~~~~~~~~~~~~~~
In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.c:5:
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘ggml_hash_find_or_insert’ defined but not used [-Wunused-function]
  282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |               ^~~~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘ggml_hash_contains’ defined but not used [-Wunused-function]
  256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |             ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘ggml_get_op_params_f32’ defined but not used [-Wunused-function]
  140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) {
      |              ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:77:13: warning: ‘ggml_are_same_layout’ defined but not used [-Wunused-function]
   77 | static bool ggml_are_same_layout(const struct ggml_tensor * a, const struct ggml_tensor * b) {
      |             ^~~~~~~~~~~~~~~~~~~~
In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.cpp:1:
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘size_t ggml_hash_find_or_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function]
  282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |               ^~~~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘size_t ggml_hash_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function]
  261 | static size_t ggml_hash_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |               ^~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘bool ggml_hash_contains(const ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function]
  256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |             ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘size_t ggml_bitset_size(size_t)’ defined but not used [-Wunused-function]
  187 | static size_t ggml_bitset_size(size_t n) {
      |               ^~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘void ggml_set_op_params_f32(ggml_tensor*, uint32_t, float)’ defined but not used [-Wunused-function]
  150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) {
      |             ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘void ggml_set_op_params_i32(ggml_tensor*, uint32_t, int32_t)’ defined but not used [-Wunused-function]
  145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) {
      |             ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘float ggml_get_op_params_f32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function]
  140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) {
      |              ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘int32_t ggml_get_op_params_i32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function]
  135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) {
      |                ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘void ggml_set_op_params(ggml_tensor*, const void*, size_t)’ defined but not used [-Wunused-function]
  129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) {
      |             ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:77:13: warning: ‘bool ggml_are_same_layout(const ggml_tensor*, const ggml_tensor*)’ defined but not used [-Wunused-function]
   77 | static bool ggml_are_same_layout(const struct ggml_tensor * a, const struct ggml_tensor * b) {
      |             ^~~~~~~~~~~~~~~~~~~~
[ 6%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-opt.cpp.o
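The long run of -Wunused-function diagnostics above (and repeated below for the remaining translation units) is benign: ggml-impl.h defines its small helpers as plain static functions, so every .c/.cpp file that includes the header without calling a given helper warns about it under -Wall. A minimal reproduction, with illustrative file names:

    # A plain static function defined in a header warns in any TU that
    # includes it but never calls it...
    printf 'static int helper(int x) { return x + 1; }\n' > impl.h
    printf '#include "impl.h"\nint main(void) { return 0; }\n' > tu.c
    gcc -Wall -c tu.c    # warns: 'helper' defined but not used
    # ...while 'static inline' is exempt from -Wunused-function.
    printf 'static inline int helper(int x) { return x + 1; }\n' > impl.h
    gcc -Wall -c tu.c    # compiles silently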
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src && /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-opt.cpp.o -MF CMakeFiles/ggml-base.dir/ggml-opt.cpp.o.d -o CMakeFiles/ggml-base.dir/ggml-opt.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-opt.cpp
In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-backend.cpp:14:
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘bool ggml_hash_contains(const ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function]
  256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |             ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘size_t ggml_bitset_size(size_t)’ defined but not used [-Wunused-function]
  187 | static size_t ggml_bitset_size(size_t n) {
      |               ^~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘void ggml_set_op_params_f32(ggml_tensor*, uint32_t, float)’ defined but not used [-Wunused-function]
  150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) {
      |             ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘void ggml_set_op_params_i32(ggml_tensor*, uint32_t, int32_t)’ defined but not used [-Wunused-function]
  145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) {
      |             ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘float ggml_get_op_params_f32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function]
  140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) {
      |              ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘int32_t ggml_get_op_params_i32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function]
  135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) {
      |                ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘void ggml_set_op_params(ggml_tensor*, const void*, size_t)’ defined but not used [-Wunused-function]
  129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) {
      |             ^~~~~~~~~~~~~~~~~~
[ 6%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-threading.cpp.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src && /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-threading.cpp.o -MF CMakeFiles/ggml-base.dir/ggml-threading.cpp.o.d -o CMakeFiles/ggml-base.dir/ggml-threading.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-threading.cpp
[ 6%] Building C object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-quants.c.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src && /usr/bin/gcc -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-quants.c.o -MF CMakeFiles/ggml-base.dir/ggml-quants.c.o.d -o CMakeFiles/ggml-base.dir/ggml-quants.c.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-quants.c
In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-opt.cpp:6:
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘size_t ggml_hash_find_or_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function]
  282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |               ^~~~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘size_t ggml_hash_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function]
  261 | static size_t ggml_hash_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |               ^~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘bool ggml_hash_contains(const ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function]
  256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) {
      |             ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘size_t ggml_bitset_size(size_t)’ defined but not used [-Wunused-function]
  187 | static size_t ggml_bitset_size(size_t n) {
      |               ^~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘void ggml_set_op_params_f32(ggml_tensor*, uint32_t, float)’ defined but not used [-Wunused-function]
  150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) {
      |             ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘void ggml_set_op_params_i32(ggml_tensor*, uint32_t, int32_t)’ defined but not used [-Wunused-function]
  145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) {
      |             ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘float ggml_get_op_params_f32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function]
  140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) {
      |              ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘int32_t ggml_get_op_params_i32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function]
  135 | 
static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘void ggml_set_op_params(ggml_tensor*, const void*, size_t)’ defined but not used [-Wunused-function] 129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:77:13: warning: ‘bool ggml_are_same_layout(const ggml_tensor*, const ggml_tensor*)’ defined but not used [-Wunused-function] 77 | static bool ggml_are_same_layout(const struct ggml_tensor * a, const struct ggml_tensor * b) { | ^~~~~~~~~~~~~~~~~~~~ [ 8%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/gguf.cpp.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src && /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/gguf.cpp.o -MF CMakeFiles/ggml-base.dir/gguf.cpp.o.d -o CMakeFiles/ggml-base.dir/gguf.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/gguf.cpp /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-quants.c:4067:12: warning: ‘iq1_find_best_neighbour’ defined but not used [-Wunused-function] 4067 | static int iq1_find_best_neighbour(const uint16_t * GGML_RESTRICT neighbours, const uint64_t * GGML_RESTRICT grid, | ^~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-quants.c:579:14: warning: ‘make_qkx1_quants’ defined but not used [-Wunused-function] 579 | static float make_qkx1_quants(int n, int nmax, const float * GGML_RESTRICT x, uint8_t * GGML_RESTRICT L, float * GGML_RESTRICT the_min, | ^~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-quants.c:5: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘ggml_hash_find_or_insert’ defined but not used 
[-Wunused-function] 282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘ggml_hash_insert’ defined but not used [-Wunused-function] 261 | static size_t ggml_hash_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘ggml_hash_contains’ defined but not used [-Wunused-function] 256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘ggml_bitset_size’ defined but not used [-Wunused-function] 187 | static size_t ggml_bitset_size(size_t n) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘ggml_set_op_params_f32’ defined but not used [-Wunused-function] 150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘ggml_set_op_params_i32’ defined but not used [-Wunused-function] 145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘ggml_get_op_params_f32’ defined but not used [-Wunused-function] 140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘ggml_get_op_params_i32’ defined but not used [-Wunused-function] 135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘ggml_set_op_params’ defined but not used [-Wunused-function] 129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:77:13: warning: ‘ggml_are_same_layout’ defined but not used [-Wunused-function] 77 | static bool ggml_are_same_layout(const struct ggml_tensor * a, const struct ggml_tensor * b) { | ^~~~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/gguf.cpp:3: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘size_t ggml_hash_find_or_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘size_t ggml_hash_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 261 | 
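Note: every -Wunused-function warning above comes from ggml-impl.h defining (not just declaring) static helper functions in the header, so each translation unit that includes it owns a private internal-linkage copy, and -Wall flags whichever copies that unit never calls. A minimal sketch of the mechanism, with illustrative names that are not from the ollama sources:

    // unused_demo.cpp -- build: g++ -Wall -std=c++17 -c unused_demo.cpp
    static int helper_used(int x)   { return x + 1; }
    static int helper_unused(int x) { return x - 1; }             // g++: 'helper_unused' defined but not used
    [[maybe_unused]] static int helper_quiet(int x) { return x; } // attribute suppresses the warning
    int consume() { return helper_used(1); }

The warnings are harmless and simply repeat once per including file, which is why the same ggml-impl.h list shows up four times in this log.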
[ 8%] Linking CXX shared library ../../../../../lib/ollama/libggml-base.so
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src && /usr/bin/cmake -E cmake_link_script CMakeFiles/ggml-base.dir/link.txt --verbose=1
/usr/bin/g++ -fPIC -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -Wl,--dependency-file=CMakeFiles/ggml-base.dir/link.d -Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-hardened-ld-errors -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes -shared -Wl,-soname,libggml-base.so -o ../../../../../lib/ollama/libggml-base.so "CMakeFiles/ggml-base.dir/ggml.c.o" "CMakeFiles/ggml-base.dir/ggml.cpp.o" "CMakeFiles/ggml-base.dir/ggml-alloc.c.o" "CMakeFiles/ggml-base.dir/ggml-backend.cpp.o" "CMakeFiles/ggml-base.dir/ggml-opt.cpp.o" "CMakeFiles/ggml-base.dir/ggml-threading.cpp.o" "CMakeFiles/ggml-base.dir/ggml-quants.c.o" "CMakeFiles/ggml-base.dir/gguf.cpp.o" -lm
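Note: the recurring -specs=/usr/lib/rpm/redhat/redhat-hardened-* flags are Fedora's distribution-wide hardening: _FORTIFY_SOURCE=3 and -fstack-protector-strong on every compile line, and -Wl,-z,relro with -Wl,-z,now on the link line (full RELRO: the GOT is made read-only after eager symbol resolution). A minimal sketch of what the fortify level buys at run time (hypothetical file, not part of this build):

    // fortify_demo.cpp -- build: g++ -O2 -D_FORTIFY_SOURCE=3 fortify_demo.cpp && ./a.out x
    #include <cstring>
    int main(int argc, char **argv) {
        char buf[8] = {0};
        if (argc > 1)
            std::memcpy(buf, argv[1], 16); // compiled to __memcpy_chk; glibc aborts with
                                           // "*** buffer overflow detected ***"
        return buf[0];
    }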
gmake[3]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13'
[ 8%] Built target ggml-base
/usr/bin/gmake -f ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/build.make ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/depend
gmake[3]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13'
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13 && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3 /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13 /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/DependInfo.cmake "--color="
gmake[3]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13'
/usr/bin/gmake -f ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/build.make ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/build
gmake[3]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13'
[ 8%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/acc.cu.o
[ 10%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/arange.cu.o
[ 10%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/add-id.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/acc.cu.o -MF CMakeFiles/ggml-cuda.dir/acc.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/acc.cu -o CMakeFiles/ggml-cuda.dir/acc.cu.o
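Note: each .cu file is compiled for eleven virtual architectures, from compute_75 (Turing) through compute_121, and the code=[compute_XY] form embeds only PTX, no SASS: the driver JIT-compiles the best-matching PTX variant for the actual GPU at load time, which keeps the fat binaries smaller and forward-compatible at the cost of a first-load JIT. A host-side sketch of how that match is made (assumes a CUDA toolkit and a visible device; the include/library paths are illustrative):

    // cc_query.cpp -- build: g++ cc_query.cpp -I/usr/local/cuda/include -L/usr/local/cuda/lib64 -lcudart
    #include <cuda_runtime.h>
    #include <cstdio>
    int main() {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {
            std::fprintf(stderr, "no CUDA device visible\n");
            return 1;
        }
        // the driver JITs the highest embedded compute_XY with XY <= this capability
        std::printf("device 0: compute capability %d.%d\n", prop.major, prop.minor);
        return 0;
    }

A compute-8.6 card, for example, would run the compute_86 PTX, while a newer architecture falls back to JIT-ing the highest variant it supports.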
"--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/acc.cu.o -MF CMakeFiles/ggml-cuda.dir/acc.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/acc.cu -o CMakeFiles/ggml-cuda.dir/acc.cu.o [ 10%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/argmax.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/add-id.cu.o -MF CMakeFiles/ggml-cuda.dir/add-id.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/add-id.cu -o CMakeFiles/ggml-cuda.dir/add-id.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/arange.cu.o -MF CMakeFiles/ggml-cuda.dir/arange.cu.o.d -x cu -c 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/arange.cu -o CMakeFiles/ggml-cuda.dir/arange.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/argmax.cu.o -MF CMakeFiles/ggml-cuda.dir/argmax.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/argmax.cu -o CMakeFiles/ggml-cuda.dir/argmax.cu.o [ 12%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/argsort.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/argsort.cu.o -MF CMakeFiles/ggml-cuda.dir/argsort.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/argsort.cu -o CMakeFiles/ggml-cuda.dir/argsort.cu.o [ 12%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/binbcast.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/binbcast.cu.o -MF CMakeFiles/ggml-cuda.dir/binbcast.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/binbcast.cu -o CMakeFiles/ggml-cuda.dir/binbcast.cu.o [ 14%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/clamp.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/clamp.cu.o -MF CMakeFiles/ggml-cuda.dir/clamp.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/clamp.cu -o CMakeFiles/ggml-cuda.dir/clamp.cu.o [ 14%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/concat.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file 
CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/concat.cu.o -MF CMakeFiles/ggml-cuda.dir/concat.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/concat.cu -o CMakeFiles/ggml-cuda.dir/concat.cu.o [ 14%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv-transpose-1d.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv-transpose-1d.cu.o -MF CMakeFiles/ggml-cuda.dir/conv-transpose-1d.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/conv-transpose-1d.cu -o CMakeFiles/ggml-cuda.dir/conv-transpose-1d.cu.o [ 17%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv2d-dw.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" 
"--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv2d-dw.cu.o -MF CMakeFiles/ggml-cuda.dir/conv2d-dw.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/conv2d-dw.cu -o CMakeFiles/ggml-cuda.dir/conv2d-dw.cu.o [ 17%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv2d-transpose.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv2d-transpose.cu.o -MF CMakeFiles/ggml-cuda.dir/conv2d-transpose.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/conv2d-transpose.cu -o CMakeFiles/ggml-cuda.dir/conv2d-transpose.cu.o [ 19%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/convert.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" 
"--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/convert.cu.o -MF CMakeFiles/ggml-cuda.dir/convert.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/convert.cu -o CMakeFiles/ggml-cuda.dir/convert.cu.o [ 19%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/count-equal.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/count-equal.cu.o -MF CMakeFiles/ggml-cuda.dir/count-equal.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/count-equal.cu -o CMakeFiles/ggml-cuda.dir/count-equal.cu.o [ 21%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/cpy.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT 
ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/cpy.cu.o -MF CMakeFiles/ggml-cuda.dir/cpy.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/cpy.cu -o CMakeFiles/ggml-cuda.dir/cpy.cu.o [ 21%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/cross-entropy-loss.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/cross-entropy-loss.cu.o -MF CMakeFiles/ggml-cuda.dir/cross-entropy-loss.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/cross-entropy-loss.cu -o CMakeFiles/ggml-cuda.dir/cross-entropy-loss.cu.o [ 23%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/diagmask.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/diagmask.cu.o -MF CMakeFiles/ggml-cuda.dir/diagmask.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/diagmask.cu -o CMakeFiles/ggml-cuda.dir/diagmask.cu.o [ 23%] Building 
CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-tile-f16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-tile-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/fattn-tile-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/fattn-tile-f16.cu -o CMakeFiles/ggml-cuda.dir/fattn-tile-f16.cu.o [ 23%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-tile-f32.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-tile-f32.cu.o -MF CMakeFiles/ggml-cuda.dir/fattn-tile-f32.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/fattn-tile-f32.cu -o CMakeFiles/ggml-cuda.dir/fattn-tile-f32.cu.o [ 25%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-wmma-f16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-wmma-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/fattn-wmma-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/fattn-wmma-f16.cu -o CMakeFiles/ggml-cuda.dir/fattn-wmma-f16.cu.o [ 25%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn.cu.o -MF CMakeFiles/ggml-cuda.dir/fattn.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/fattn.cu -o CMakeFiles/ggml-cuda.dir/fattn.cu.o [ 27%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/getrows.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file 
CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/getrows.cu.o -MF CMakeFiles/ggml-cuda.dir/getrows.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/getrows.cu -o CMakeFiles/ggml-cuda.dir/getrows.cu.o [ 27%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ggml-cuda.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ggml-cuda.cu.o -MF CMakeFiles/ggml-cuda.dir/ggml-cuda.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/ggml-cuda.cu -o CMakeFiles/ggml-cuda.dir/ggml-cuda.cu.o [ 29%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/gla.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" 
"--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/gla.cu.o -MF CMakeFiles/ggml-cuda.dir/gla.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/gla.cu -o CMakeFiles/ggml-cuda.dir/gla.cu.o [ 29%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/im2col.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/im2col.cu.o -MF CMakeFiles/ggml-cuda.dir/im2col.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/im2col.cu -o CMakeFiles/ggml-cuda.dir/im2col.cu.o [ 29%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mean.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" 
"--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mean.cu.o -MF CMakeFiles/ggml-cuda.dir/mean.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mean.cu -o CMakeFiles/ggml-cuda.dir/mean.cu.o [ 31%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmf.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmf.cu.o -MF CMakeFiles/ggml-cuda.dir/mmf.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mmf.cu -o CMakeFiles/ggml-cuda.dir/mmf.cu.o [ 31%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmq.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmq.cu.o -MF CMakeFiles/ggml-cuda.dir/mmq.cu.o.d -x cu -c 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mmq.cu -o CMakeFiles/ggml-cuda.dir/mmq.cu.o [ 34%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmvf.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmvf.cu.o -MF CMakeFiles/ggml-cuda.dir/mmvf.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mmvf.cu -o CMakeFiles/ggml-cuda.dir/mmvf.cu.o [ 34%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmvq.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmvq.cu.o -MF CMakeFiles/ggml-cuda.dir/mmvq.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mmvq.cu -o CMakeFiles/ggml-cuda.dir/mmvq.cu.o [ 36%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/norm.cu.o cd 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/norm.cu.o -MF CMakeFiles/ggml-cuda.dir/norm.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/norm.cu -o CMakeFiles/ggml-cuda.dir/norm.cu.o [ 36%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/opt-step-adamw.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/opt-step-adamw.cu.o -MF CMakeFiles/ggml-cuda.dir/opt-step-adamw.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/opt-step-adamw.cu -o CMakeFiles/ggml-cuda.dir/opt-step-adamw.cu.o [ 38%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/out-prod.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 
-DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/out-prod.cu.o -MF CMakeFiles/ggml-cuda.dir/out-prod.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/out-prod.cu -o CMakeFiles/ggml-cuda.dir/out-prod.cu.o [ 38%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/pad.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/pad.cu.o -MF CMakeFiles/ggml-cuda.dir/pad.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/pad.cu -o CMakeFiles/ggml-cuda.dir/pad.cu.o [ 38%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/pool2d.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" 
"--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/pool2d.cu.o -MF CMakeFiles/ggml-cuda.dir/pool2d.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/pool2d.cu -o CMakeFiles/ggml-cuda.dir/pool2d.cu.o [ 40%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/quantize.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/quantize.cu.o -MF CMakeFiles/ggml-cuda.dir/quantize.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/quantize.cu -o CMakeFiles/ggml-cuda.dir/quantize.cu.o [ 40%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/roll.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" 
"--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/roll.cu.o -MF CMakeFiles/ggml-cuda.dir/roll.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/roll.cu -o CMakeFiles/ggml-cuda.dir/roll.cu.o [ 42%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/rope.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/rope.cu.o -MF CMakeFiles/ggml-cuda.dir/rope.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/rope.cu -o CMakeFiles/ggml-cuda.dir/rope.cu.o [ 42%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/scale.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/scale.cu.o -MF 
CMakeFiles/ggml-cuda.dir/scale.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/scale.cu -o CMakeFiles/ggml-cuda.dir/scale.cu.o [ 44%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/set-rows.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/set-rows.cu.o -MF CMakeFiles/ggml-cuda.dir/set-rows.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/set-rows.cu -o CMakeFiles/ggml-cuda.dir/set-rows.cu.o [ 44%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/softcap.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/softcap.cu.o -MF CMakeFiles/ggml-cuda.dir/softcap.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/softcap.cu -o CMakeFiles/ggml-cuda.dir/softcap.cu.o [ 46%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/softmax.cu.o cd 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/softmax.cu.o -MF CMakeFiles/ggml-cuda.dir/softmax.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/softmax.cu -o CMakeFiles/ggml-cuda.dir/softmax.cu.o [ 46%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ssm-conv.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ssm-conv.cu.o -MF CMakeFiles/ggml-cuda.dir/ssm-conv.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/ssm-conv.cu -o CMakeFiles/ggml-cuda.dir/ssm-conv.cu.o [ 46%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ssm-scan.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 
-DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ssm-scan.cu.o -MF CMakeFiles/ggml-cuda.dir/ssm-scan.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/ssm-scan.cu -o CMakeFiles/ggml-cuda.dir/ssm-scan.cu.o [ 48%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/sum.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/sum.cu.o -MF CMakeFiles/ggml-cuda.dir/sum.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/sum.cu -o CMakeFiles/ggml-cuda.dir/sum.cu.o [ 48%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/sumrows.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" 
"--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/sumrows.cu.o -MF CMakeFiles/ggml-cuda.dir/sumrows.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/sumrows.cu -o CMakeFiles/ggml-cuda.dir/sumrows.cu.o [ 51%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/tsembd.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/tsembd.cu.o -MF CMakeFiles/ggml-cuda.dir/tsembd.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/tsembd.cu -o CMakeFiles/ggml-cuda.dir/tsembd.cu.o [ 51%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/unary.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" 
"--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/unary.cu.o -MF CMakeFiles/ggml-cuda.dir/unary.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/unary.cu -o CMakeFiles/ggml-cuda.dir/unary.cu.o [ 53%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/upscale.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/upscale.cu.o -MF CMakeFiles/ggml-cuda.dir/upscale.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/upscale.cu -o CMakeFiles/ggml-cuda.dir/upscale.cu.o [ 53%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/wkv.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/wkv.cu.o -MF 
CMakeFiles/ggml-cuda.dir/wkv.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/wkv.cu -o CMakeFiles/ggml-cuda.dir/wkv.cu.o [ 53%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu.o [ 55%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT 
ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o [ 55%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu.o [ 57%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" 
"--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o [ 57%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu.o [ 59%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" 
"--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu.o [ 59%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o [ 61%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" 
"--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o [ 61%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu.o [ 61%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED 
-DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu.o [ 63%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu.o [ 63%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu.o [ 65%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu.o [ 65%] Building CUDA object 
ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu.o [ 68%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu.o.d -x cu -c 
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu.o [ 68%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu.o [ 68%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT 
ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o [ 70%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu.o [ 70%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" 
"--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu.o [ 72%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq1_s.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq1_s.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq1_s.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq1_s.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq1_s.cu.o [ 72%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_s.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" 
"--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_s.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_s.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq2_s.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_s.cu.o [ 74%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xs.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xs.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xs.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq2_xs.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xs.cu.o [ 74%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xxs.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" 
"--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xxs.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xxs.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq2_xxs.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xxs.cu.o [ 76%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_s.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_s.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_s.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq3_s.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_s.cu.o [ 76%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_xxs.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" 
"--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_xxs.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_xxs.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq3_xxs.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_xxs.cu.o [ 76%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_nl.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_nl.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_nl.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq4_nl.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_nl.cu.o [ 78%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_xs.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" 
"--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_xs.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_xs.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq4_xs.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_xs.cu.o [ 78%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-mxfp4.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-mxfp4.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-mxfp4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-mxfp4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-mxfp4.cu.o [ 80%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q2_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" 
"--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q2_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q2_k.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q2_k.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q2_k.cu.o [ 80%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q3_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q3_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q3_k.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q3_k.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q3_k.cu.o [ 82%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" 
"--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q4_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_0.cu.o [ 82%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q4_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_1.cu.o [ 82%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" 
"--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_k.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q4_k.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_k.cu.o [ 85%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q5_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_0.cu.o [ 85%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" 
"--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q5_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_1.cu.o [ 87%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_k.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q5_k.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_k.cu.o [ 87%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q6_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" 
"--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q6_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q6_k.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q6_k.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q6_k.cu.o [ 89%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q8_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q8_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q8_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q8_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q8_0.cu.o [ 89%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" 
"--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu.o [ 91%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu.o [ 91%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" 
"--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu.o [ 91%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu.o [ 93%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 
"--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu.o [ 93%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu.o [ 95%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED 
-DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu.o [ 95%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu.o [ 97%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler 
-DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu.o [ 97%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_75,code=[compute_75]" "--generate-code=arch=compute_80,code=[compute_80]" "--generate-code=arch=compute_86,code=[compute_86]" "--generate-code=arch=compute_87,code=[compute_87]" "--generate-code=arch=compute_89,code=[compute_89]" "--generate-code=arch=compute_90,code=[compute_90]" "--generate-code=arch=compute_90a,code=[compute_90a]" "--generate-code=arch=compute_100,code=[compute_100]" "--generate-code=arch=compute_110,code=[compute_110]" "--generate-code=arch=compute_120,code=[compute_120]" "--generate-code=arch=compute_121,code=[compute_121]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu.o [100%] Linking CUDA shared module ../../../../../../lib/ollama/libggml-cuda.so cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/ml/backend/ggml/ggml/src/ggml-cuda && /usr/bin/cmake -E cmake_link_script 
CMakeFiles/ggml-cuda.dir/link.txt --verbose=1 /usr/bin/g++ -fPIC -Wl,--dependency-file=CMakeFiles/ggml-cuda.dir/link.d -Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-hardened-ld-errors -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes -shared -o ../../../../../../lib/ollama/libggml-cuda.so @CMakeFiles/ggml-cuda.dir/objects1.rsp @CMakeFiles/ggml-cuda.dir/linkLibs.rsp -L"/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/targets/x86_64-linux/lib/stubs" -L"/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-13/targets/x86_64-linux/lib" gmake[3]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13' [100%] Built target ggml-cuda gmake[2]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13' /usr/bin/cmake -E cmake_progress_start /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13/CMakeFiles 0 gmake[1]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-13' + CFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer ' + export CFLAGS + CXXFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer ' + export CXXFLAGS + FFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules ' + export FFLAGS + FCFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules ' + export FCFLAGS + VALAFLAGS=-g + export VALAFLAGS + RUSTFLAGS='-Copt-level=3 -Cdebuginfo=2 -Ccodegen-units=1 -Cstrip=none -Cforce-frame-pointers=yes -Clink-arg=-specs=/usr/lib/rpm/redhat/redhat-package-notes --cap-lints=warn' + export RUSTFLAGS + LDFLAGS='-Wl,-z,relro -Wl,--as-needed 
-Wl,-z,pack-relative-relocs -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-hardened-ld-errors -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes ' + export LDFLAGS + LT_SYS_LIBRARY_PATH=/usr/lib64: + export LT_SYS_LIBRARY_PATH + CC=gcc + export CC + CXX=g++ + export CXX + /usr/bin/cmake -S . -B redhat-linux-build_cuda-12 -DCMAKE_C_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_CXX_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_Fortran_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_VERBOSE_MAKEFILE:BOOL=ON -DCMAKE_INSTALL_DO_STRIP:BOOL=OFF -DCMAKE_INSTALL_PREFIX:PATH=/usr -DCMAKE_INSTALL_FULL_SBINDIR:PATH=/usr/bin -DCMAKE_INSTALL_SBINDIR:PATH=bin -DINCLUDE_INSTALL_DIR:PATH=/usr/include -DLIB_INSTALL_DIR:PATH=/usr/lib64 -DSYSCONF_INSTALL_DIR:PATH=/etc -DSHARE_INSTALL_PREFIX:PATH=/usr/share -DLIB_SUFFIX=64 -DBUILD_SHARED_LIBS:BOOL=ON --preset 'CUDA 12' -DOLLAMA_RUNNER_DIR=cuda_v12 -DCMAKE_CUDA_COMPILER=/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -DCMAKE_CUDA_HOST_COMPILER=g++-14 -DCMAKE_CUDA_FLAGS_RELEASE=-DNDEBUG '-DCMAKE_CUDA_FLAGS=-O2 -g -Xcompiler "-fPIC"' Preset CMake variables: CMAKE_BUILD_TYPE="Release" CMAKE_CUDA_ARCHITECTURES="50;60;61;70;75;80;86;87;89;90;90a;120" CMAKE_MSVC_RUNTIME_LIBRARY="MultiThreaded" -- The C compiler identification is GNU 15.2.1 -- The CXX compiler identification is GNU 15.2.1 -- Detecting C compiler ABI info -- Detecting C compiler ABI info - done -- Check for working C compiler: /usr/bin/gcc - skipped -- Detecting C compile features -- Detecting C compile features - done -- Detecting CXX compiler ABI info -- Detecting CXX compiler ABI info - done -- Check for working CXX compiler: /usr/bin/g++ - skipped -- Detecting CXX compile features -- Detecting CXX compile features - done -- Performing Test CMAKE_HAVE_LIBC_PTHREAD -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success -- Found Threads: TRUE -- Warning: ccache not found - consider installing it for faster compilation or disable this warning with GGML_CCACHE=OFF -- CMAKE_SYSTEM_PROCESSOR: x86_64 -- GGML_SYSTEM_ARCH: x86 -- Including CPU backend -- x86 detected -- Adding CPU backend variant ggml-cpu-x64: -- x86 detected -- Adding CPU backend variant ggml-cpu-sse42: -msse4.2 GGML_SSE42 -- x86 detected -- Adding CPU backend variant ggml-cpu-sandybridge: -msse4.2;-mavx GGML_SSE42;GGML_AVX -- x86 detected -- Adding CPU backend variant ggml-cpu-haswell: -msse4.2;-mf16c;-mfma;-mbmi2;-mavx;-mavx2 GGML_SSE42;GGML_F16C;GGML_FMA;GGML_BMI2;GGML_AVX;GGML_AVX2 -- x86 detected -- Adding CPU backend variant ggml-cpu-skylakex: -msse4.2;-mf16c;-mfma;-mbmi2;-mavx;-mavx2;-mavx512f;-mavx512cd;-mavx512vl;-mavx512dq;-mavx512bw GGML_SSE42;GGML_F16C;GGML_FMA;GGML_BMI2;GGML_AVX;GGML_AVX2;GGML_AVX512 -- x86 detected -- Adding CPU backend variant ggml-cpu-icelake: -msse4.2;-mf16c;-mfma;-mbmi2;-mavx;-mavx2;-mavx512f;-mavx512cd;-mavx512vl;-mavx512dq;-mavx512bw;-mavx512vbmi;-mavx512vnni GGML_SSE42;GGML_F16C;GGML_FMA;GGML_BMI2;GGML_AVX;GGML_AVX2;GGML_AVX512;GGML_AVX512_VBMI;GGML_AVX512_VNNI -- x86 detected -- Adding CPU backend variant ggml-cpu-alderlake: -msse4.2;-mf16c;-mfma;-mbmi2;-mavx;-mavx2;-mavxvnni GGML_SSE42;GGML_F16C;GGML_FMA;GGML_BMI2;GGML_AVX;GGML_AVX2;GGML_AVX_VNNI -- Found CUDAToolkit: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/targets/x86_64-linux/include (found version "12.9.86") -- CUDA Toolkit found -- Using CUDA architectures: 50;60;61;70;75;80;86;87;89;90;90a;120 -- The CUDA compiler identification 
is NVIDIA 12.9.86 with host compiler GNU 14.3.1 -- Detecting CUDA compiler ABI info -- Detecting CUDA compiler ABI info - done -- Check for working CUDA compiler: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc - skipped -- Detecting CUDA compile features -- Detecting CUDA compile features - done -- Looking for a HIP compiler -- Looking for a HIP compiler - NOTFOUND -- Configuring done (5.7s) -- Generating done (0.0s) CMake Warning: Manually-specified variables were not used by the project: CMAKE_Fortran_FLAGS_RELEASE CMAKE_INSTALL_DO_STRIP INCLUDE_INSTALL_DIR LIB_SUFFIX SHARE_INSTALL_PREFIX SYSCONF_INSTALL_DIR -- Build files have been written to: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12 + /usr/bin/cmake --build redhat-linux-build_cuda-12 -j4 --verbose --target ggml-cuda Change Dir: '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12' Run Build Command(s): /usr/bin/cmake -E env VERBOSE=1 /usr/bin/gmake -f Makefile -j4 ggml-cuda /usr/bin/cmake -S/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3 -B/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12 --check-build-system CMakeFiles/Makefile.cmake 0 /usr/bin/gmake -f CMakeFiles/Makefile2 ggml-cuda gmake[1]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12' /usr/bin/cmake -S/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3 -B/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12 --check-build-system CMakeFiles/Makefile.cmake 0 /usr/bin/cmake -E cmake_progress_start /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/CMakeFiles 47 /usr/bin/gmake -f CMakeFiles/Makefile2 ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/all gmake[2]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12' /usr/bin/gmake -f ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/build.make ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/depend gmake[3]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12' cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12 && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3 /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12 /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/DependInfo.cmake "--color=" gmake[3]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12' /usr/bin/gmake -f ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/build.make ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/build gmake[3]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12' [ 2%] Building C object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml.c.o [ 4%] Building C object 
ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-alloc.c.o [ 4%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml.cpp.o [ 4%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-backend.cpp.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src && /usr/bin/gcc -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml.c.o -MF CMakeFiles/ggml-base.dir/ggml.c.o.d -o CMakeFiles/ggml-base.dir/ggml.c.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.c cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src && /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. 
-I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml.cpp.o -MF CMakeFiles/ggml-base.dir/ggml.cpp.o.d -o CMakeFiles/ggml-base.dir/ggml.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.cpp cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src && /usr/bin/gcc -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-alloc.c.o -MF CMakeFiles/ggml-base.dir/ggml-alloc.c.o.d -o CMakeFiles/ggml-base.dir/ggml-alloc.c.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-alloc.c cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src && /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. 
-I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-backend.cpp.o -MF CMakeFiles/ggml-base.dir/ggml-backend.cpp.o.d -o CMakeFiles/ggml-base.dir/ggml-backend.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-backend.cpp In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-alloc.c:4: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘ggml_hash_insert’ defined but not used [-Wunused-function] 261 | static size_t ggml_hash_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘ggml_hash_contains’ defined but not used [-Wunused-function] 256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘ggml_bitset_size’ defined but not used [-Wunused-function] 187 | static size_t ggml_bitset_size(size_t n) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘ggml_set_op_params_f32’ defined but not used [-Wunused-function] 150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘ggml_set_op_params_i32’ defined but not used [-Wunused-function] 145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘ggml_get_op_params_f32’ defined but not used [-Wunused-function] 140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘ggml_get_op_params_i32’ defined but not used [-Wunused-function] 135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘ggml_set_op_params’ defined but not used [-Wunused-function] 129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.c:5663:13: warning: ‘ggml_hash_map_free’ defined but not 
used [-Wunused-function] 5663 | static void ggml_hash_map_free(struct hash_map * map) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.c:5656:26: warning: ‘ggml_new_hash_map’ defined but not used [-Wunused-function] 5656 | static struct hash_map * ggml_new_hash_map(size_t size) { | ^~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.c:5: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘ggml_hash_find_or_insert’ defined but not used [-Wunused-function] 282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘ggml_hash_contains’ defined but not used [-Wunused-function] 256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘ggml_get_op_params_f32’ defined but not used [-Wunused-function] 140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:77:13: warning: ‘ggml_are_same_layout’ defined but not used [-Wunused-function] 77 | static bool ggml_are_same_layout(const struct ggml_tensor * a, const struct ggml_tensor * b) { | ^~~~~~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml.cpp:1: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘size_t ggml_hash_find_or_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘size_t ggml_hash_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 261 | static size_t ggml_hash_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘bool ggml_hash_contains(const ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘size_t ggml_bitset_size(size_t)’ defined but not used [-Wunused-function] 187 | static size_t ggml_bitset_size(size_t n) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘void ggml_set_op_params_f32(ggml_tensor*, uint32_t, float)’ defined but not used [-Wunused-function] 150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) { | ^~~~~~~~~~~~~~~~~~~~~~ 
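Annotation: the -Wunused-function diagnostics above (and their repeats for every translation unit below) all follow one pattern. ggml-impl.h defines small helpers with internal linkage (plain static, not static inline), so every .c/.cpp file that includes the header receives its own private copy, and any file that never calls a given helper trips the -Wall unused-function check. They are noise rather than errors and do not stop the build. A minimal reproduction sketch, using hypothetical file names impl.h and user.cu that merely mirror the ggml-impl.h pattern (not files from the ollama tree):

    /* impl.h (hypothetical): a helper with internal linkage, as in ggml-impl.h */
    #pragma once
    #include <stddef.h>

    /* plain `static`: each includer gets its own copy; unused copies warn */
    static size_t bitset_size(size_t n) { return (n + 31) / 32; }

    /* user.cu (hypothetical): includes the header but never calls the helper */
    #include "impl.h"
    int main(void) { return 0; }

    /* nvcc -Xcompiler -Wall -c user.cu   (or: gcc -Wall -c user.c)
     *   -> warning: 'bitset_size' defined but not used [-Wunused-function]
     * Declaring the helper `static inline`, or tagging it
     * __attribute__((unused)), silences the warning. */

This is also why each warning cites the same ggml-impl.h line numbers yet is re-reported once per object file being compiled.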
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘void ggml_set_op_params_i32(ggml_tensor*, uint32_t, int32_t)’ defined but not used [-Wunused-function] 145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘float ggml_get_op_params_f32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function] 140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘int32_t ggml_get_op_params_i32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function] 135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘void ggml_set_op_params(ggml_tensor*, const void*, size_t)’ defined but not used [-Wunused-function] 129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:77:13: warning: ‘bool ggml_are_same_layout(const ggml_tensor*, const ggml_tensor*)’ defined but not used [-Wunused-function] 77 | static bool ggml_are_same_layout(const struct ggml_tensor * a, const struct ggml_tensor * b) { | ^~~~~~~~~~~~~~~~~~~~ [ 6%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-opt.cpp.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src && /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. 
-I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-opt.cpp.o -MF CMakeFiles/ggml-base.dir/ggml-opt.cpp.o.d -o CMakeFiles/ggml-base.dir/ggml-opt.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-opt.cpp In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-backend.cpp:14: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘bool ggml_hash_contains(const ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘size_t ggml_bitset_size(size_t)’ defined but not used [-Wunused-function] 187 | static size_t ggml_bitset_size(size_t n) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘void ggml_set_op_params_f32(ggml_tensor*, uint32_t, float)’ defined but not used [-Wunused-function] 150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘void ggml_set_op_params_i32(ggml_tensor*, uint32_t, int32_t)’ defined but not used [-Wunused-function] 145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘float ggml_get_op_params_f32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function] 140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘int32_t ggml_get_op_params_i32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function] 135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘void ggml_set_op_params(ggml_tensor*, const void*, size_t)’ defined but not used [-Wunused-function] 129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) { | ^~~~~~~~~~~~~~~~~~ [ 6%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-threading.cpp.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src 
&& /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-threading.cpp.o -MF CMakeFiles/ggml-base.dir/ggml-threading.cpp.o.d -o CMakeFiles/ggml-base.dir/ggml-threading.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-threading.cpp [ 6%] Building C object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-quants.c.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src && /usr/bin/gcc -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. 
-I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/ggml-quants.c.o -MF CMakeFiles/ggml-base.dir/ggml-quants.c.o.d -o CMakeFiles/ggml-base.dir/ggml-quants.c.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-quants.c In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-opt.cpp:6: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘size_t ggml_hash_find_or_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘size_t ggml_hash_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 261 | static size_t ggml_hash_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘bool ggml_hash_contains(const ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘size_t ggml_bitset_size(size_t)’ defined but not used [-Wunused-function] 187 | static size_t ggml_bitset_size(size_t n) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘void ggml_set_op_params_f32(ggml_tensor*, uint32_t, float)’ defined but not used [-Wunused-function] 150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘void ggml_set_op_params_i32(ggml_tensor*, uint32_t, int32_t)’ defined but not used [-Wunused-function] 145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘float ggml_get_op_params_f32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function] 140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘int32_t ggml_get_op_params_i32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function] 135 | 
static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘void ggml_set_op_params(ggml_tensor*, const void*, size_t)’ defined but not used [-Wunused-function] 129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:77:13: warning: ‘bool ggml_are_same_layout(const ggml_tensor*, const ggml_tensor*)’ defined but not used [-Wunused-function] 77 | static bool ggml_are_same_layout(const struct ggml_tensor * a, const struct ggml_tensor * b) { | ^~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-quants.c:4067:12: warning: ‘iq1_find_best_neighbour’ defined but not used [-Wunused-function] 4067 | static int iq1_find_best_neighbour(const uint16_t * GGML_RESTRICT neighbours, const uint64_t * GGML_RESTRICT grid, | ^~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-quants.c:579:14: warning: ‘make_qkx1_quants’ defined but not used [-Wunused-function] 579 | static float make_qkx1_quants(int n, int nmax, const float * GGML_RESTRICT x, uint8_t * GGML_RESTRICT L, float * GGML_RESTRICT the_min, | ^~~~~~~~~~~~~~~~ In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-quants.c:5: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘ggml_hash_find_or_insert’ defined but not used [-Wunused-function] 282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘ggml_hash_insert’ defined but not used [-Wunused-function] 261 | static size_t ggml_hash_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘ggml_hash_contains’ defined but not used [-Wunused-function] 256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘ggml_bitset_size’ defined but not used [-Wunused-function] 187 | static size_t ggml_bitset_size(size_t n) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘ggml_set_op_params_f32’ defined but not used [-Wunused-function] 150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘ggml_set_op_params_i32’ defined but not used [-Wunused-function] 145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: 
‘ggml_get_op_params_f32’ defined but not used [-Wunused-function] 140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘ggml_get_op_params_i32’ defined but not used [-Wunused-function] 135 | static int32_t ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘ggml_set_op_params’ defined but not used [-Wunused-function] 129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:77:13: warning: ‘ggml_are_same_layout’ defined but not used [-Wunused-function] 77 | static bool ggml_are_same_layout(const struct ggml_tensor * a, const struct ggml_tensor * b) { | ^~~~~~~~~~~~~~~~~~~~ [ 8%] Building CXX object ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/gguf.cpp.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src && /usr/bin/g++ -DGGML_BACKEND_DL -DGGML_BUILD -DGGML_COMMIT=0x0 -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_base_EXPORTS -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/include -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cpu/amx -I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/. 
-I/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/../include -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -std=c++17 -fPIC -MD -MT ml/backend/ggml/ggml/src/CMakeFiles/ggml-base.dir/gguf.cpp.o -MF CMakeFiles/ggml-base.dir/gguf.cpp.o.d -o CMakeFiles/ggml-base.dir/gguf.cpp.o -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/gguf.cpp In file included from /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/gguf.cpp:3: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:282:15: warning: ‘size_t ggml_hash_find_or_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 282 | static size_t ggml_hash_find_or_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:261:15: warning: ‘size_t ggml_hash_insert(ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 261 | static size_t ggml_hash_insert(struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:256:13: warning: ‘bool ggml_hash_contains(const ggml_hash_set*, ggml_tensor*)’ defined but not used [-Wunused-function] 256 | static bool ggml_hash_contains(const struct ggml_hash_set * hash_set, struct ggml_tensor * key) { | ^~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:187:15: warning: ‘size_t ggml_bitset_size(size_t)’ defined but not used [-Wunused-function] 187 | static size_t ggml_bitset_size(size_t n) { | ^~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:150:13: warning: ‘void ggml_set_op_params_f32(ggml_tensor*, uint32_t, float)’ defined but not used [-Wunused-function] 150 | static void ggml_set_op_params_f32(struct ggml_tensor * tensor, uint32_t i, float value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:145:13: warning: ‘void ggml_set_op_params_i32(ggml_tensor*, uint32_t, int32_t)’ defined but not used [-Wunused-function] 145 | static void ggml_set_op_params_i32(struct ggml_tensor * tensor, uint32_t i, int32_t value) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:140:14: warning: ‘float ggml_get_op_params_f32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function] 140 | static float ggml_get_op_params_f32(const struct ggml_tensor * tensor, uint32_t i) { | ^~~~~~~~~~~~~~~~~~~~~~ /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:135:16: warning: ‘int32_t ggml_get_op_params_i32(const ggml_tensor*, uint32_t)’ defined but not used [-Wunused-function] 135 | static int32_t 
ggml_get_op_params_i32(const struct ggml_tensor * tensor, uint32_t i) {
| ^~~~~~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:129:13: warning: ‘void ggml_set_op_params(ggml_tensor*, const void*, size_t)’ defined but not used [-Wunused-function]
129 | static void ggml_set_op_params(struct ggml_tensor * tensor, const void * params, size_t params_size) {
| ^~~~~~~~~~~~~~~~~~
/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-impl.h:77:13: warning: ‘bool ggml_are_same_layout(const ggml_tensor*, const ggml_tensor*)’ defined but not used [-Wunused-function]
77 | static bool ggml_are_same_layout(const struct ggml_tensor * a, const struct ggml_tensor * b) {
| ^~~~~~~~~~~~~~~~~~~~
[ 8%] Linking CXX shared library ../../../../../lib/ollama/libggml-base.so
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src && /usr/bin/cmake -E cmake_link_script CMakeFiles/ggml-base.dir/link.txt --verbose=1
/usr/bin/g++ -fPIC -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -Wl,--dependency-file=CMakeFiles/ggml-base.dir/link.d -Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-hardened-ld-errors -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes -shared -Wl,-soname,libggml-base.so -o ../../../../../lib/ollama/libggml-base.so "CMakeFiles/ggml-base.dir/ggml.c.o" "CMakeFiles/ggml-base.dir/ggml.cpp.o" "CMakeFiles/ggml-base.dir/ggml-alloc.c.o" "CMakeFiles/ggml-base.dir/ggml-backend.cpp.o" "CMakeFiles/ggml-base.dir/ggml-opt.cpp.o" "CMakeFiles/ggml-base.dir/ggml-threading.cpp.o" "CMakeFiles/ggml-base.dir/ggml-quants.c.o" "CMakeFiles/ggml-base.dir/gguf.cpp.o" -lm
gmake[3]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12'
[ 8%] Built target ggml-base
/usr/bin/gmake -f ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/build.make ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/depend
gmake[3]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12'
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12 && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3 /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12 /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/DependInfo.cmake "--color="
gmake[3]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12'
/usr/bin/gmake -f ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/build.make ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/build
gmake[3]: Entering directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12'
[ 8%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/acc.cu.o
[ 10%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/arange.cu.o
[ 10%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/argmax.cu.o
[ 10%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/add-id.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/acc.cu.o -MF CMakeFiles/ggml-cuda.dir/acc.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/acc.cu -o CMakeFiles/ggml-cuda.dir/acc.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/add-id.cu.o -MF CMakeFiles/ggml-cuda.dir/add-id.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/add-id.cu -o CMakeFiles/ggml-cuda.dir/add-id.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/arange.cu.o -MF CMakeFiles/ggml-cuda.dir/arange.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/arange.cu -o CMakeFiles/ggml-cuda.dir/arange.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/argmax.cu.o -MF CMakeFiles/ggml-cuda.dir/argmax.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/argmax.cu -o CMakeFiles/ggml-cuda.dir/argmax.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 12%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/argsort.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/argsort.cu.o -MF CMakeFiles/ggml-cuda.dir/argsort.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/argsort.cu -o CMakeFiles/ggml-cuda.dir/argsort.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 12%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/binbcast.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/binbcast.cu.o -MF CMakeFiles/ggml-cuda.dir/binbcast.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/binbcast.cu -o CMakeFiles/ggml-cuda.dir/binbcast.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 14%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/clamp.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/clamp.cu.o -MF CMakeFiles/ggml-cuda.dir/clamp.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/clamp.cu -o CMakeFiles/ggml-cuda.dir/clamp.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 14%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/concat.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/concat.cu.o -MF CMakeFiles/ggml-cuda.dir/concat.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/concat.cu -o CMakeFiles/ggml-cuda.dir/concat.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 14%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv-transpose-1d.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv-transpose-1d.cu.o -MF CMakeFiles/ggml-cuda.dir/conv-transpose-1d.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/conv-transpose-1d.cu -o CMakeFiles/ggml-cuda.dir/conv-transpose-1d.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 17%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv2d-dw.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv2d-dw.cu.o -MF CMakeFiles/ggml-cuda.dir/conv2d-dw.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/conv2d-dw.cu -o CMakeFiles/ggml-cuda.dir/conv2d-dw.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 17%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv2d-transpose.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv2d-transpose.cu.o -MF CMakeFiles/ggml-cuda.dir/conv2d-transpose.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/conv2d-transpose.cu -o CMakeFiles/ggml-cuda.dir/conv2d-transpose.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 19%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/convert.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/convert.cu.o -MF CMakeFiles/ggml-cuda.dir/convert.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/convert.cu -o CMakeFiles/ggml-cuda.dir/convert.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 19%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/count-equal.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/count-equal.cu.o -MF CMakeFiles/ggml-cuda.dir/count-equal.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/count-equal.cu -o CMakeFiles/ggml-cuda.dir/count-equal.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 21%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/cpy.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/cpy.cu.o -MF CMakeFiles/ggml-cuda.dir/cpy.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/cpy.cu -o CMakeFiles/ggml-cuda.dir/cpy.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 21%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/cross-entropy-loss.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/cross-entropy-loss.cu.o -MF CMakeFiles/ggml-cuda.dir/cross-entropy-loss.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/cross-entropy-loss.cu -o CMakeFiles/ggml-cuda.dir/cross-entropy-loss.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 23%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/diagmask.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/diagmask.cu.o -MF CMakeFiles/ggml-cuda.dir/diagmask.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/diagmask.cu -o CMakeFiles/ggml-cuda.dir/diagmask.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 23%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-tile-f16.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-tile-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/fattn-tile-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/fattn-tile-f16.cu -o CMakeFiles/ggml-cuda.dir/fattn-tile-f16.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 23%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-tile-f32.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-tile-f32.cu.o -MF CMakeFiles/ggml-cuda.dir/fattn-tile-f32.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/fattn-tile-f32.cu -o CMakeFiles/ggml-cuda.dir/fattn-tile-f32.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 25%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-wmma-f16.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-wmma-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/fattn-wmma-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/fattn-wmma-f16.cu -o CMakeFiles/ggml-cuda.dir/fattn-wmma-f16.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 25%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn.cu.o -MF CMakeFiles/ggml-cuda.dir/fattn.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/fattn.cu -o CMakeFiles/ggml-cuda.dir/fattn.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 27%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/getrows.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/getrows.cu.o -MF CMakeFiles/ggml-cuda.dir/getrows.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/getrows.cu -o CMakeFiles/ggml-cuda.dir/getrows.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 27%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ggml-cuda.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ggml-cuda.cu.o -MF CMakeFiles/ggml-cuda.dir/ggml-cuda.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/ggml-cuda.cu -o CMakeFiles/ggml-cuda.dir/ggml-cuda.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 29%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/gla.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/gla.cu.o -MF CMakeFiles/ggml-cuda.dir/gla.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/gla.cu -o CMakeFiles/ggml-cuda.dir/gla.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 29%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/im2col.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/im2col.cu.o -MF CMakeFiles/ggml-cuda.dir/im2col.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/im2col.cu -o CMakeFiles/ggml-cuda.dir/im2col.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 29%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mean.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mean.cu.o -MF CMakeFiles/ggml-cuda.dir/mean.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mean.cu -o CMakeFiles/ggml-cuda.dir/mean.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 31%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmf.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmf.cu.o -MF CMakeFiles/ggml-cuda.dir/mmf.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mmf.cu -o CMakeFiles/ggml-cuda.dir/mmf.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 31%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmq.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmq.cu.o -MF CMakeFiles/ggml-cuda.dir/mmq.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mmq.cu -o CMakeFiles/ggml-cuda.dir/mmq.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 34%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmvf.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmvf.cu.o -MF CMakeFiles/ggml-cuda.dir/mmvf.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mmvf.cu -o CMakeFiles/ggml-cuda.dir/mmvf.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 34%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmvq.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmvq.cu.o -MF CMakeFiles/ggml-cuda.dir/mmvq.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/mmvq.cu -o CMakeFiles/ggml-cuda.dir/mmvq.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 36%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/norm.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/norm.cu.o -MF CMakeFiles/ggml-cuda.dir/norm.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/norm.cu -o CMakeFiles/ggml-cuda.dir/norm.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 36%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/opt-step-adamw.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/opt-step-adamw.cu.o -MF CMakeFiles/ggml-cuda.dir/opt-step-adamw.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/opt-step-adamw.cu -o CMakeFiles/ggml-cuda.dir/opt-step-adamw.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 38%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/out-prod.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/out-prod.cu.o -MF CMakeFiles/ggml-cuda.dir/out-prod.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/out-prod.cu -o CMakeFiles/ggml-cuda.dir/out-prod.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 38%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/pad.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/pad.cu.o -MF CMakeFiles/ggml-cuda.dir/pad.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/pad.cu -o CMakeFiles/ggml-cuda.dir/pad.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 38%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/pool2d.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/pool2d.cu.o -MF CMakeFiles/ggml-cuda.dir/pool2d.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/pool2d.cu -o CMakeFiles/ggml-cuda.dir/pool2d.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 40%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/quantize.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/quantize.cu.o -MF CMakeFiles/ggml-cuda.dir/quantize.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/quantize.cu -o CMakeFiles/ggml-cuda.dir/quantize.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 40%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/roll.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/roll.cu.o -MF CMakeFiles/ggml-cuda.dir/roll.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/roll.cu -o CMakeFiles/ggml-cuda.dir/roll.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 42%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/rope.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/rope.cu.o -MF CMakeFiles/ggml-cuda.dir/rope.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/rope.cu -o CMakeFiles/ggml-cuda.dir/rope.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 42%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/scale.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/scale.cu.o -MF CMakeFiles/ggml-cuda.dir/scale.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/scale.cu -o CMakeFiles/ggml-cuda.dir/scale.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 44%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/set-rows.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/set-rows.cu.o -MF CMakeFiles/ggml-cuda.dir/set-rows.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/set-rows.cu -o CMakeFiles/ggml-cuda.dir/set-rows.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 44%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/softcap.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/softcap.cu.o -MF CMakeFiles/ggml-cuda.dir/softcap.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/softcap.cu -o CMakeFiles/ggml-cuda.dir/softcap.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 46%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/softmax.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/softmax.cu.o -MF CMakeFiles/ggml-cuda.dir/softmax.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/softmax.cu -o CMakeFiles/ggml-cuda.dir/softmax.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 46%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ssm-conv.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ssm-conv.cu.o -MF CMakeFiles/ggml-cuda.dir/ssm-conv.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/ssm-conv.cu -o CMakeFiles/ggml-cuda.dir/ssm-conv.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 46%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ssm-scan.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ssm-scan.cu.o -MF CMakeFiles/ggml-cuda.dir/ssm-scan.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/ssm-scan.cu -o CMakeFiles/ggml-cuda.dir/ssm-scan.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 48%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/sum.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/sum.cu.o -MF CMakeFiles/ggml-cuda.dir/sum.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/sum.cu -o CMakeFiles/ggml-cuda.dir/sum.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 48%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/sumrows.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/sumrows.cu.o -MF CMakeFiles/ggml-cuda.dir/sumrows.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/sumrows.cu -o CMakeFiles/ggml-cuda.dir/sumrows.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 51%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/tsembd.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/tsembd.cu.o -MF CMakeFiles/ggml-cuda.dir/tsembd.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/tsembd.cu -o CMakeFiles/ggml-cuda.dir/tsembd.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 51%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/unary.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/unary.cu.o -MF CMakeFiles/ggml-cuda.dir/unary.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/unary.cu -o CMakeFiles/ggml-cuda.dir/unary.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 53%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/upscale.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/upscale.cu.o -MF CMakeFiles/ggml-cuda.dir/upscale.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/upscale.cu -o CMakeFiles/ggml-cuda.dir/upscale.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 53%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/wkv.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/wkv.cu.o -MF CMakeFiles/ggml-cuda.dir/wkv.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/wkv.cu -o CMakeFiles/ggml-cuda.dir/wkv.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
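The remaining CUDA objects in this target are the pre-generated FlashAttention template instances: each template-instances/fattn-mma-f16-instance-ncols1_<N1>-ncols2_<N2>.cu file carries the explicit instantiation of the f16 MMA FlashAttention kernel for one (ncols1, ncols2) tiling, and splitting the specializations into separate translation units lets them compile in parallel instead of inside one oversized file. A quick way to count how many such instances the tree ships, run from the ollama-0.12.3 source root (an illustrative command, not part of this build):

    ls ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-*.cu | wc -l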
[ 53%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 55%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 55%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 57%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 57%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 59%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 59%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 61%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 61%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 61%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 63%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 63%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 65%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 65%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 68%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 68%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 68%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 70%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 70%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
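
The fattn-mma-f16-instance-ncols1_*-ncols2_*.cu objects above are variants of one FlashAttention MMA kernel, instantiated per (ncols1, ncols2) tile shape. Baking the shape in as compile-time constants lets the compiler fully unroll the tile loops and fix register usage per variant, which is presumably why each shape gets its own translation unit. A hedged sketch of the pattern (all identifiers are illustrative, not ggml's real ones; compile with nvcc -c):

    // fattn_tile_sketch.cu -- hypothetical sketch of one-instantiation-per-shape.
    #include <cuda_runtime.h>

    template <int ncols1, int ncols2>
    __global__ void fattn_tile_sketch(float *out) {
        // Accumulator sized by the template parameters: a true compile-time
        // constant, so the loop below unrolls completely for each variant.
        float acc[ncols1 * ncols2] = {};
    #pragma unroll
        for (int i = 0; i < ncols1 * ncols2; ++i)
            acc[i] += 1.0f;  // stand-in for the real MMA accumulation
        out[threadIdx.x] = acc[0];
    }

    // One explicit instantiation per shape, mirroring file names like
    // ncols1_4-ncols2_16 and ncols1_8-ncols2_8 above:
    template __global__ void fattn_tile_sketch<4, 16>(float *);
    template __global__ void fattn_tile_sketch<8, 8>(float *);
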
[ 72%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq1_s.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq1_s.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq1_s.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq1_s.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq1_s.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 72%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_s.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_s.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_s.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq2_s.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_s.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 74%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xs.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xs.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xs.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq2_xs.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xs.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 74%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xxs.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xxs.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xxs.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq2_xxs.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xxs.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 76%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_s.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_s.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_s.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq3_s.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_s.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 76%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_xxs.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_xxs.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_xxs.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq3_xxs.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_xxs.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 76%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_nl.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_nl.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_nl.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq4_nl.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_nl.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 78%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_xs.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_xs.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_xs.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-iq4_xs.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_xs.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 78%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-mxfp4.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-mxfp4.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-mxfp4.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-mxfp4.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-mxfp4.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 80%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q2_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q2_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q2_k.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q2_k.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q2_k.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 80%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q3_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q3_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q3_k.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q3_k.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q3_k.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 82%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q4_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_0.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 82%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q4_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_1.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 82%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_k.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q4_k.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_k.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 85%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q5_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_0.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 85%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_1.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_1.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_1.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q5_1.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_1.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 87%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_k.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q5_k.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_k.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 87%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q6_k.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q6_k.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q6_k.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q6_k.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q6_k.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[ 89%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q8_0.cu.o cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q8_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q8_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/mmq-instance-q8_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q8_0.cu.o nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
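
The mmq-instance-*.cu objects above (iq1_s through q8_0) cover the matrix-multiply-quantized kernel, one quantization format per file. Splitting the instantiations across translation units keeps each nvcc job small and lets the parallel make jobs compile them concurrently, which matters at these per-file compile times. A minimal sketch of the pattern under that assumption (placeholder identifiers; block_q4_0 here is not ggml's real block layout; compile with nvcc -c):

    // mmq_instance_sketch.cu -- hypothetical one-quant-type-per-file sketch.
    #include <cuda_runtime.h>

    struct block_q4_0 { unsigned char qs[16]; unsigned short d; };  // placeholder block

    // Shared kernel template; in the real tree this would live in a common header:
    template <typename BlockQ>
    __global__ void mul_mat_q_sketch(const BlockQ *w, const float *x, float *y, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) y[i] = x[i];  // stand-in for the real dequantize + dot product
    }

    // The entire body of a hypothetical mmq-instance-q4_0.cu: force exactly one
    // instantiation, so the expensive expansions run as independent compiler jobs.
    template __global__ void mul_mat_q_sketch<block_q4_0>(const block_q4_0 *, const float *, float *, int);
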
[ 89%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 91%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 91%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 91%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 93%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 93%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 95%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 95%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 97%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[ 97%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu.o
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/bin/nvcc -forward-unknown-to-host-compiler -ccbin=g++-14 -DGGML_BACKEND_BUILD -DGGML_BACKEND_DL -DGGML_BACKEND_SHARED -DGGML_COMMIT=0x0 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SHARED -DGGML_VERSION=0x0 -DNDEBUG -Dggml_cuda_EXPORTS --options-file CMakeFiles/ggml-cuda.dir/includes_CUDA.rsp -O2 -g -Xcompiler "-fPIC" -DNDEBUG -std=c++17 "--generate-code=arch=compute_50,code=[compute_50,sm_50]" "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_90a,code=[compute_90a,sm_90a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIC -use_fast_math -extended-lambda -compress-mode=default -Xcompiler -Wno-pedantic -MD -MT ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu.o -MF CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu.o.d -x cu -c /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu -o CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu.o
nvcc warning : Support for offline compilation for architectures prior to '_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
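Editor's note: each template-instance compile above ends with the same nvcc deprecation warning because the --generate-code list still includes pre-Turing targets (compute_50 through compute_70). A minimal shell sketch of the two usual remedies, assuming nvcc is on PATH and reusing one of the source files above (the output name test.o is illustrative, and a real compile would also need the include flags from includes_CUDA.rsp):

    # Remedy 1: keep the old targets and silence the warning, as nvcc itself suggests
    nvcc -Wno-deprecated-gpu-targets \
        "--generate-code=arch=compute_50,code=[compute_50,sm_50]" \
        -x cu -c fattn-vec-f16-instance-hs128-q4_0-q4_0.cu -o test.o
    # Remedy 2: target only sm_75 and newer, so the warning never fires
    nvcc "--generate-code=arch=compute_75,code=[compute_75,sm_75]" \
        -x cu -c fattn-vec-f16-instance-hs128-q4_0-q4_0.cu -o test.o

In a CMake-driven build like this one, the architecture list would normally be changed through CMAKE_CUDA_ARCHITECTURES rather than by editing the generated command lines.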
[100%] Linking CUDA shared module ../../../../../../lib/ollama/libggml-cuda.so
cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/ml/backend/ggml/ggml/src/ggml-cuda && /usr/bin/cmake -E cmake_link_script CMakeFiles/ggml-cuda.dir/link.txt --verbose=1
/usr/bin/g++-14 -fPIC -Wl,--dependency-file=CMakeFiles/ggml-cuda.dir/link.d -Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-hardened-ld-errors -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes -shared -o ../../../../../../lib/ollama/libggml-cuda.so @CMakeFiles/ggml-cuda.dir/objects1.rsp @CMakeFiles/ggml-cuda.dir/linkLibs.rsp -L"/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/targets/x86_64-linux/lib/stubs" -L"/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/cuda-12/targets/x86_64-linux/lib"
gmake[3]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12'
[100%] Built target ggml-cuda
gmake[2]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12'
/usr/bin/cmake -E cmake_progress_start /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12/CMakeFiles 0
gmake[1]: Leaving directory '/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/redhat-linux-build_cuda-12'
+ RPM_EC=0
++ jobs -p
+ exit 0
Executing(%install): /bin/sh -e /var/tmp/rpm-tmp.frFtDr
+ umask 022
+ cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ '[' /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT '!=' / ']'
+ rm -rf /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT
++ dirname /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT
+ mkdir -p /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ mkdir /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT
+ CFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer '
+ export CFLAGS
+ CXXFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer '
+ export CXXFLAGS
+ FFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules '
+ export FFLAGS
+ FCFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules '
+ export FCFLAGS
+ VALAFLAGS=-g
+ export VALAFLAGS
+ RUSTFLAGS='-Copt-level=3 -Cdebuginfo=2 -Ccodegen-units=1 -Cstrip=none -Cforce-frame-pointers=yes -Clink-arg=-specs=/usr/lib/rpm/redhat/redhat-package-notes --cap-lints=warn'
+ export RUSTFLAGS
+ LDFLAGS='-Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-hardened-ld-errors -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes '
+ export LDFLAGS
+ LT_SYS_LIBRARY_PATH=/usr/lib64:
+ export LT_SYS_LIBRARY_PATH
+ CC=gcc
+ export CC
+ CXX=g++
+ export CXX
+ cd ollama-0.12.3
+ DESTDIR=/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT
+ /usr/bin/cmake --install redhat-linux-build_cuda-13 --component CUDA
-- Install configuration: "Release"
-- Installing: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/lib64/ollama/cuda_v13/libggml-cuda.so
-- Set non-toolchain portion of runtime path of "/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/lib64/ollama/cuda_v13/libggml-cuda.so" to ""
+ DESTDIR=/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT
+ /usr/bin/cmake --install redhat-linux-build_cuda-12 --component CUDA
-- Install configuration: "Release"
-- Installing: /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/lib64/ollama/cuda_v12/libggml-cuda.so
-- Set non-toolchain portion of runtime path of "/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/lib64/ollama/cuda_v12/libggml-cuda.so" to ""
+ /usr/bin/find-debuginfo -j4 --strict-build-id -m -i --build-id-seed 0.12.3-1.fc43 --unique-debug-suffix -0.12.3-1.fc43.x86_64 --unique-debug-src-base ollama-ggml-cuda-0.12.3-1.fc43.x86_64 --run-dwz --dwz-low-mem-die-limit 10000000 --dwz-max-die-limit 110000000 -S debugsourcefiles.list /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3
find-debuginfo: starting
Extracting debug info from 2 files
DWARF-compressing 2 files
sepdebugcrcfix: Updated 2 CRC32s, 0 CRC32s did match.
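Editor's note: the %install step above stages two separately configured build trees into a single BUILDROOT. Each cmake --install run is restricted to the CUDA install component, and DESTDIR prefixes every destination path, which is how the cuda_v13 and cuda_v12 copies of libggml-cuda.so end up side by side. A condensed sketch of the pattern, assuming the same build directories as in the trace:

    # DESTDIR redirects the install root; --component limits the run to one install component
    DESTDIR="$PWD/BUILDROOT" cmake --install redhat-linux-build_cuda-13 --component CUDA
    DESTDIR="$PWD/BUILDROOT" cmake --install redhat-linux-build_cuda-12 --component CUDA

The empty runtime path reported for each installed .so is the expected outcome here: check-rpaths later verifies that the staged libraries carry no build-tree RPATH.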
Creating .debug symlinks for symlinks to ELF files
Copying sources found by 'debugedit -l' to /usr/src/debug/ollama-ggml-cuda-0.12.3-1.fc43.x86_64
find-debuginfo: done
+ /usr/lib/rpm/check-buildroot
+ /usr/lib/rpm/redhat/brp-ldconfig
+ /usr/lib/rpm/brp-compress
+ /usr/lib/rpm/redhat/brp-strip-lto /usr/bin/strip
+ /usr/lib/rpm/check-rpaths
+ /usr/lib/rpm/redhat/brp-mangle-shebangs
+ /usr/lib/rpm/brp-remove-la-files
+ /usr/lib/rpm/redhat/brp-python-rpm-in-distinfo
+ env /usr/lib/rpm/redhat/brp-python-bytecompile '' 1 0 -j4
+ /usr/lib/rpm/redhat/brp-python-hardlink
+ /usr/bin/add-determinism --brp -j4 /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT
Scanned 39 directories and 162 files, processed 0 inodes, 0 modified (0 replaced + 0 rewritten), 0 unsupported format, 0 errors
Reading /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/SPECPARTS/rpm-debuginfo.specpart
Processing files: ollama-ggml-cuda-13-0.12.3-1.fc43.x86_64
Executing(%license): /bin/sh -e /var/tmp/rpm-tmp.pE1mjN
+ umask 022
+ cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ cd ollama-0.12.3
+ LICENSEDIR=/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/share/licenses/ollama-ggml-cuda-13
+ export LC_ALL=C.UTF-8
+ LC_ALL=C.UTF-8
+ export LICENSEDIR
+ /usr/bin/mkdir -p /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/share/licenses/ollama-ggml-cuda-13
+ cp -pr /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/LICENSE /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/share/licenses/ollama-ggml-cuda-13
+ RPM_EC=0
++ jobs -p
+ exit 0
Provides: libggml-cuda.so()(64bit) ollama-ggml-cuda-13 = 0.12.3-1.fc43 ollama-ggml-cuda-13(x86-64) = 0.12.3-1.fc43
Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1
Requires: libc.so.6()(64bit) libc.so.6(GLIBC_2.14)(64bit) libc.so.6(GLIBC_2.2.5)(64bit) libc.so.6(GLIBC_ABI_DT_RELR)(64bit) libcublas.so.13()(64bit) libcublas.so.13(libcublas.so.13)(64bit) libcuda.so.1()(64bit) libcudart.so.13()(64bit) libcudart.so.13(libcudart.so.13)(64bit) libgcc_s.so.1()(64bit) libgcc_s.so.1(GCC_3.0)(64bit) libm.so.6()(64bit) libm.so.6(GLIBC_2.27)(64bit) libstdc++.so.6()(64bit) libstdc++.so.6(CXXABI_1.3)(64bit) libstdc++.so.6(CXXABI_1.3.9)(64bit) libstdc++.so.6(GLIBCXX_3.4)(64bit) libstdc++.so.6(GLIBCXX_3.4.11)(64bit) libstdc++.so.6(GLIBCXX_3.4.21)(64bit) libstdc++.so.6(GLIBCXX_3.4.30)(64bit) libstdc++.so.6(GLIBCXX_3.4.32)(64bit) rtld(GNU_HASH)
Supplements: if libcublas-13-0 ollama-ggml
Processing files: ollama-ggml-cuda-12-0.12.3-1.fc43.x86_64
Executing(%license): /bin/sh -e /var/tmp/rpm-tmp.fgRqVh
+ umask 022
+ cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ cd ollama-0.12.3
+ LICENSEDIR=/builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/share/licenses/ollama-ggml-cuda-12
+ export LC_ALL=C.UTF-8
+ LC_ALL=C.UTF-8
+ export LICENSEDIR
+ /usr/bin/mkdir -p /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/share/licenses/ollama-ggml-cuda-12
+ cp -pr /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/ollama-0.12.3/ml/backend/ggml/ggml/LICENSE /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT/usr/share/licenses/ollama-ggml-cuda-12
+ RPM_EC=0
++ jobs -p
+ exit 0
Provides: libggml-cuda.so()(64bit) ollama-ggml-cuda-12 = 0.12.3-1.fc43 ollama-ggml-cuda-12(x86-64) = 0.12.3-1.fc43
Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1
Requires: libc.so.6()(64bit) libc.so.6(GLIBC_2.14)(64bit) libc.so.6(GLIBC_2.2.5)(64bit) libc.so.6(GLIBC_ABI_DT_RELR)(64bit) libcublas.so.12()(64bit) libcublas.so.12(libcublas.so.12)(64bit) libcuda.so.1()(64bit) libcudart.so.12()(64bit) libcudart.so.12(libcudart.so.12)(64bit) libgcc_s.so.1()(64bit) libgcc_s.so.1(GCC_3.0)(64bit) libm.so.6()(64bit) libm.so.6(GLIBC_2.27)(64bit) libstdc++.so.6()(64bit) libstdc++.so.6(CXXABI_1.3)(64bit) libstdc++.so.6(CXXABI_1.3.9)(64bit) libstdc++.so.6(GLIBCXX_3.4)(64bit) libstdc++.so.6(GLIBCXX_3.4.11)(64bit) libstdc++.so.6(GLIBCXX_3.4.21)(64bit) libstdc++.so.6(GLIBCXX_3.4.30)(64bit) libstdc++.so.6(GLIBCXX_3.4.32)(64bit) rtld(GNU_HASH)
Supplements: if libcublas-12-9 ollama-ggml
Processing files: ollama-ggml-cuda-debugsource-0.12.3-1.fc43.x86_64
Provides: ollama-ggml-cuda-debugsource = 0.12.3-1.fc43 ollama-ggml-cuda-debugsource(x86-64) = 0.12.3-1.fc43
Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1
Processing files: ollama-ggml-cuda-debuginfo-0.12.3-1.fc43.x86_64
Provides: ollama-ggml-cuda-debuginfo = 0.12.3-1.fc43 ollama-ggml-cuda-debuginfo(x86-64) = 0.12.3-1.fc43
Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1
Recommends: ollama-ggml-cuda-debugsource(x86-64) = 0.12.3-1.fc43
Processing files: ollama-ggml-cuda-13-debuginfo-0.12.3-1.fc43.x86_64
Provides: debuginfo(build-id) = 756cf1705443b059226e82334d07056834b9e1cd libggml-cuda.so-0.12.3-1.fc43.x86_64.debug()(64bit) ollama-ggml-cuda-13-debuginfo = 0.12.3-1.fc43 ollama-ggml-cuda-13-debuginfo(x86-64) = 0.12.3-1.fc43
Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1
Recommends: ollama-ggml-cuda-debugsource(x86-64) = 0.12.3-1.fc43
Processing files: ollama-ggml-cuda-12-debuginfo-0.12.3-1.fc43.x86_64
Provides: debuginfo(build-id) = b39287180b9e5b05eb972c11586ef26cef286bf8 libggml-cuda.so-0.12.3-1.fc43.x86_64.debug()(64bit) ollama-ggml-cuda-12-debuginfo = 0.12.3-1.fc43 ollama-ggml-cuda-12-debuginfo(x86-64) = 0.12.3-1.fc43
Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1
Recommends: ollama-ggml-cuda-debugsource(x86-64) = 0.12.3-1.fc43
Checking for unpackaged file(s): /usr/lib/rpm/check-files /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build/BUILDROOT
Wrote: /builddir/build/RPMS/ollama-ggml-cuda-12-debuginfo-0.12.3-1.fc43.x86_64.rpm
Wrote: /builddir/build/RPMS/ollama-ggml-cuda-debugsource-0.12.3-1.fc43.x86_64.rpm
Wrote: /builddir/build/RPMS/ollama-ggml-cuda-13-debuginfo-0.12.3-1.fc43.x86_64.rpm
Wrote: /builddir/build/RPMS/ollama-ggml-cuda-debuginfo-0.12.3-1.fc43.x86_64.rpm
Wrote: /builddir/build/RPMS/ollama-ggml-cuda-13-0.12.3-1.fc43.x86_64.rpm
Wrote: /builddir/build/RPMS/ollama-ggml-cuda-12-0.12.3-1.fc43.x86_64.rpm
Executing(rmbuild): /bin/sh -e /var/tmp/rpm-tmp.4sJVuw
+ umask 022
+ cd /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ test -d /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ /usr/bin/chmod -Rf a+rX,u+w,g-w,o-w /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ rm -rf /builddir/build/BUILD/ollama-ggml-cuda-0.12.3-build
+ RPM_EC=0
++ jobs -p
+ exit 0
Finish: rpmbuild ollama-ggml-cuda-0.12.3-1.fc43.src.rpm
Finish: build phase for ollama-ggml-cuda-0.12.3-1.fc43.src.rpm
INFO: chroot_scan: 1 files copied to /var/lib/copr-rpmbuild/results/chroot_scan
INFO: /var/lib/mock/fedora-43-x86_64-1759434727.591343/root/var/log/dnf5.log
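Editor's note: the Provides:/Requires: blocks above are generated automatically by rpmbuild's ELF dependency generator from the SONAMEs and versioned symbols of each staged libggml-cuda.so, which is why the cuda_v13 subpackage picks up libcublas.so.13/libcudart.so.13 while the cuda_v12 one picks up their .12 counterparts. A quick way to double-check the metadata of a written package, assuming the paths from the Wrote: lines above:

    # List the computed capabilities of one subpackage
    rpm -qp --provides /builddir/build/RPMS/ollama-ggml-cuda-12-0.12.3-1.fc43.x86_64.rpm
    rpm -qp --requires /builddir/build/RPMS/ollama-ggml-cuda-12-0.12.3-1.fc43.x86_64.rpm

Note that libcuda.so.1 is deliberately left unresolved at build time: it is supplied by the NVIDIA driver at runtime, which is why the link step earlier pointed the linker at the CUDA stubs directory.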
INFO: chroot_scan: creating tarball /var/lib/copr-rpmbuild/results/chroot_scan.tar.gz
/bin/tar: Removing leading `/' from member names
INFO: Done(/var/lib/copr-rpmbuild/results/ollama-ggml-cuda-0.12.3-1.fc43.src.rpm) Config(child) 106 minutes 47 seconds
INFO: Results and/or logs in: /var/lib/copr-rpmbuild/results
INFO: Cleaning up build root ('cleanup_on_success=True')
Start: clean chroot
INFO: unmounting tmpfs.
Finish: clean chroot
Finish: run
Running RPMResults tool
Package info:
{
    "packages": [
        {
            "name": "ollama-ggml-cuda-12",
            "epoch": null,
            "version": "0.12.3",
            "release": "1.fc43",
            "arch": "x86_64"
        },
        {
            "name": "ollama-ggml-cuda-13-debuginfo",
            "epoch": null,
            "version": "0.12.3",
            "release": "1.fc43",
            "arch": "x86_64"
        },
        {
            "name": "ollama-ggml-cuda",
            "epoch": null,
            "version": "0.12.3",
            "release": "1.fc43",
            "arch": "src"
        },
        {
            "name": "ollama-ggml-cuda-debuginfo",
            "epoch": null,
            "version": "0.12.3",
            "release": "1.fc43",
            "arch": "x86_64"
        },
        {
            "name": "ollama-ggml-cuda-12-debuginfo",
            "epoch": null,
            "version": "0.12.3",
            "release": "1.fc43",
            "arch": "x86_64"
        },
        {
            "name": "ollama-ggml-cuda-13",
            "epoch": null,
            "version": "0.12.3",
            "release": "1.fc43",
            "arch": "x86_64"
        },
        {
            "name": "ollama-ggml-cuda-debugsource",
            "epoch": null,
            "version": "0.12.3",
            "release": "1.fc43",
            "arch": "x86_64"
        }
    ]
}
RPMResults finished
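Editor's note: the Package info block is machine-readable JSON emitted by the RPMResults tool, summarizing the six binary rpms plus the rebuilt src.rpm. A hypothetical one-liner for consuming it, assuming the JSON has been saved to a file named results.json:

    # Print each built package as name.arch
    jq -r '.packages[] | "\(.name).\(.arch)"' results.json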